Abstract



Nonparametric methods for the estimation of the Levy density 
of a Levy process X are developed. Estimators that can be written 
in terms of the "jumps" of X are introduced, and so are discrete- 
data based approximations. A model selection approach made up of 
two steps is investigated. The first step consists in the selection of a 
good estimator from a linear model of proposed Levy densities, while 
the second is a data-driven selection of a linear model among a given 
collection of linear models. By providing lower bounds for the minimax 
risk of estimation over Besov Levy densities, our estimators are shown 
to achieve the "best" rate of convergence. A numerical study for 
the case of histogram estimators and for variance Gamma processes, 
models of key importance in risky asset price modeling driven by Levy 
processes, is presented. 



1 Introduction 

The class of Levy processes is central to the theory of stochastic processes (see
[?] and [?] for excellent monographs on the topic). Recently, new subclasses
of Levy processes have been introduced and actively investigated, mostly
because of their relevance to mathematical finance. Among the better known
models are the variance Gamma model of [16], the CGMY model of [?], and
the generalized hyperbolic motion of [3] and [12] (see also [?] and [?]). This
development is not surprising if one recalls the traditional model for
risky assets, namely the Black-Scholes model. In this model the price $S(t)$ of
an asset at time $t$ is assumed to be governed by

$$S(t) = S(0)\, e^{\sigma B(t) + \mu t},$$

where $B(t)$ is a standard Brownian motion. However, well-documented
empirical evidence against the Black-Scholes model, especially in describing
high-frequency data and option prices, has led researchers to consider
non-Gaussian models (see for instance [?], [13], [?], [?], and references
therein). The transition to Levy processes is natural since these preserve
the statistical properties of the increments of Brownian motion, but relax the path
continuity by allowing jump-like discontinuities (a specification that is more
consistent with the real evolution of stock prices through time). Another
point that naturally led to Levy processes was the use of economically relevant
random clocks (like volume or number of trades) instead of the historical
time. Specifically, a more robust and sensible model is to take



$$S(t) = S(0)\, e^{\sigma B(T(t)) + \mu T(t)}, \tag{1.1}$$



where $T(t)$ is a general increasing stochastic process with $T(0) = 0$. In
that case, if $T(t)$ has independent and stationary increments, the log return
process is necessarily a Levy process (see 30.1 in [34]). Such considerations
led to the study of exponential Levy processes of the form

$$S(t) = S(0)\, e^{X(t)},$$

where $X(t)$ is a Levy process. This paradigm has proved successful
in accounting for many of the empirical features of financial data. However,
among other drawbacks, the high computational intensity and numerical
issues involved in calibrating such models have prevented them from being
more widely used in practice. In particular, these difficulties become very
serious when dealing with "high-frequency" data.

Levy processes are determined by three "parameters": a non-negative real
$\sigma^2$, a real $\mu$, and a measure $\nu$ on $\mathbb{R}\setminus\{0\}$. These three parameters characterize
the dynamics of a Levy process $\{X(t)\}_{t \ge 0}$ as the superposition of a Brownian
motion with drift, $\sigma B(t) + \mu t$, and a pure-jump Levy process whose jump
behavior is specified by the measure $\nu$ as follows:

$$\nu(A) = \frac{1}{t}\, E\Big[\sum_{s \le t} \chi_A(\Delta X(s))\Big],$$

where $\Delta X(t) = X(t) - X(t^-)$ is the jump of $X$ at time $t$ and $A$ is such that the
indicator $\chi_A(\cdot)$ vanishes in a neighborhood of the origin (this is a consequence
of the so-called Levy-Ito decomposition for the sample paths of processes with
independent increments; see Theorem 13.4 of [21] or Section 19 of [34]). We
assume throughout that $\nu$ is determined by a function $p : \mathbb{R}\setminus\{0\} \to [0, \infty)$,
called the Levy density, in the following sense:

$$\nu(A) = \int_A p(x)\, dx.$$

In that case, the value of $p$ at $x_0$ provides, roughly speaking, information on
the frequency of jumps with sizes "close" to $x_0$.
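As an illustration of the jump-counting characterization of the Levy measure, the following sketch estimates $\nu(A)$ by the average number of jumps per unit time with sizes in $A$. It is only a sanity check under stated assumptions, not a procedure from the paper: the process is taken to be compound Poisson with Exp(1) jump sizes, so that $p(x) = \lambda e^{-x}$ for $x > 0$, and the names `simulate_jumps` and `empirical_levy_measure` are hypothetical.

```python
import math
import random

def simulate_jumps(rate, horizon, rng):
    """Jump sizes on [0, horizon] of a compound Poisson process.

    Illustrative assumption: jumps arrive at `rate` per unit time and have
    Exp(1) sizes, so the Levy density is p(x) = rate * exp(-x) for x > 0.
    """
    jumps = []
    t = rng.expovariate(rate)            # time of the first jump
    while t <= horizon:
        jumps.append(rng.expovariate(1.0))
        t += rng.expovariate(rate)       # add the next inter-arrival time
    return jumps

def empirical_levy_measure(jumps, horizon, lo, hi):
    """(1/t) * #{jumps with size in [lo, hi)}: sample analogue of nu(A)."""
    return sum(1 for x in jumps if lo <= x < hi) / horizon

rng = random.Random(0)
rate, horizon, lo, hi = 3.0, 2000.0, 0.5, 1.5
jumps = simulate_jumps(rate, horizon, rng)
nu_hat = empirical_levy_measure(jumps, horizon, lo, hi)
nu_true = rate * (math.exp(-lo) - math.exp(-hi))   # integral of p over [lo, hi)
print(nu_hat, nu_true)
```

The set $A = [0.5, 1.5)$ is bounded away from the origin, as the characterization requires; the two printed values should be close for a long observation window.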
Estimating the Levy density poses a nontrivial problem, even when $p$ takes
simple parametric forms. Parsimonious Levy densities usually produce
densities for the marginals $X(t)$, $t > 0$, that are not only intractable
but sometimes not even expressible in closed form. The current practice of estimation relies on approximations
of the density function using inversion formulas combined with likelihood
methods (see for instance [?]). Such approximations make the estimation
particularly susceptible to numerical errors and mis-specification; that is,
slight changes in the model can produce quite different results. It is important
to notice that these problems become quite critical for "high-frequency"
data. Other common calibration methods include simulation-based methods
and multinomial log-likelihoods (see for instance [?] and [?]).

In the present paper, we introduce new estimation methods for the Levy
density. We concentrate on model-free estimation schemes that allow us to
efficiently retrieve a fairly general Levy density. Since the approach is nonparametric, it relaxes
the dependency on the model and lets the data itself validate the best
model. Three theories serve as foundations for our methodology: i) the
characterization of the jumps associated with a Levy process as a spatial Poisson
process, ii) some recent methods for the nonparametric estimation of
spatial Poisson processes introduced in [30], and iii) the short-term properties
of Levy processes to approximate jump-dependent quantities. To the best
of our knowledge, such a connection between the Levy density and the
statistical properties of the process in small time spans has not been used for
calibration purposes before the present work. It is relevant to point out that
our procedures are suitable for high-frequency data, which is widely available
nowadays. Furthermore, it is precisely for such data that standard statistical
estimation methods are not viable, the traditional geometric Brownian
motion model is totally inaccurate, and general exponential Levy models may
be more relevant.

Let us describe the outline of the paper. In Section 2 we construct
functional estimators which can be written in terms of integrals of deterministic
functions with respect to the random measure associated with the jumps of
$X$. The proposed method follows the reasoning of the works on minimum
contrast estimation on sieves and model selection developed in the context of
density estimation and nonlinear regression in [?] (see [?] and [?]) and
recently extended to the estimation of intensity functions for Poisson processes
in [30]. Concretely, the procedure addresses two problems: 1) the selection
of a good estimator, called the projection estimator, from a linear model $S$
of possible estimators, and 2) the selection of a linear model among a given
collection of linear models using a penalization technique that leads to a
penalized projection estimator (p.p.e.). A bound for the risk of the p.p.e. is found
in Section 3. As a consequence, Oracle inequalities, which ensure that the p.p.e.
approximately attains the best expected error (among projection estimators) up to a
constant, are obtained. We also assess the rate of convergence of the p.p.e.
on regular splines, when the Levy density belongs to some Besov spaces. By
analyzing the minimax risk of estimation on these Besov spaces, it is
actually proved in Section 4 that the p.p.e. attains the best possible rate in the
minimax sense, when the estimation is based on jumps bounded away from
the origin. In Section 5 and a subsequent section we examine the problem that the Poisson
jump measure cannot be retrieved from discrete observations, and devise an
approximation procedure for Poisson integrals based on equally spaced
sampling observations of the process. Finally, in the last part our methods are
applied to the estimation of a classical model used in mathematical finance:
the Variance Gamma model of [16]. The Levy processes are simulated using
time series representations and "discrete skeletons", whereas the considered
estimators are mainly regular histograms.



2 A model-free estimation method 

Consider a real Levy process $X = \{X(t)\}_{t \ge 0}$ with Levy density $p$. That is,
$X$ is a cadlag process with independent and stationary increments such that
the characteristic function of its marginals is given by

$$E\big[e^{iuX(t)}\big] = \exp\Big\{ t \Big( iub - \frac{u^2\sigma^2}{2} + \int_{\mathbb{R}_0} \big( e^{iux} - 1 - iux\,1_{[|x| \le 1]} \big)\, p(x)\,dx \Big) \Big\}, \tag{2.1}$$

where $\mathbb{R}_0 = \mathbb{R}\setminus\{0\}$ and $p : \mathbb{R}_0 \to \mathbb{R}_+$ satisfies

$$\int_{\mathbb{R}_0} (1 \wedge x^2)\, p(x)\,dx < \infty. \tag{2.2}$$

Being a cadlag process, the set of jump times $\{t > 0 : |X(t) - X(t^-)| > 0\}$ is
countable and, for Borel subsets $B$ of $[0, \infty) \times \mathbb{R}_0$,

$$J(B) = \#\{t > 0 : (t, X(t) - X(t^-)) \in B\}, \tag{2.3}$$

is a well-defined random measure on $[0, \infty) \times \mathbb{R}_0$, with $\#$ denoting cardinality.
The Levy-Ito decomposition of the sample paths (see Theorem 19.2 of [34])
implies that $J$ is a Poisson process on the Borel sets $\mathcal{B}([0, \infty) \times \mathbb{R}_0)$ with
mean measure given by

$$\mu(B) = \iint_B p(x)\, dt\, dx. \tag{2.4}$$

We study the problem of estimating the Levy density $p$ on a Borel set $D \in \mathcal{B}(\mathbb{R}_0)$
using a projection estimation approach. According to this paradigm,
$p$ is estimated by estimating its best approximating function in a
finite-dimensional linear space $S$. The linear space $S$ is taken so that it has good
approximation qualities in general classes of functions. Typical choices are
piecewise polynomials or wavelets. In order for this approach to be general
enough but still feasible, it is usually assumed that the function to be
estimated is bounded and belongs to an $L^2$ space on $D$, simplifying the task of
specifying the best approximating function. The simplest case is when $p$ is
taken bounded and $\int_D p^2(x)\,dx < \infty$. This condition is quite general if $D$ is
away from the origin, since (2.2) entails

$$\int_{|x| > \varepsilon} p^2(x)\,dx < \infty, \tag{2.5}$$

for any $\varepsilon > 0$, when $p$ is bounded on $\{x : |x| > \varepsilon\}$. However, around the
origin the Levy density is not bounded in most applications. This motivates
the use of measures different from the Lebesgue measure. Concretely, it is
assumed that the Levy measure $\nu(dx) = p(x)dx$ is absolutely continuous
with respect to a known measure $\eta$ on $\mathcal{B}(D)$ and that the Radon-Nikodym
derivative

$$\frac{d\nu}{d\eta}(x) = s(x), \quad x \in D, \tag{2.6}$$

is positive, bounded, and satisfies

$$\int_D s^2(x)\, \eta(dx) < \infty. \tag{2.7}$$

Definition 2.1 If (2.6) and (2.7) are verified, we say that $\eta$ is a regularizing
measure for the Levy density $p$. In that case, $s$ is referred to as the regularized
(under $\eta$) Levy density of $p$ on $D$.

Notice that under the previous regularization assumption, the measure $J$
of (2.3), when restricted to $\mathcal{B}([0, \infty) \times D)$, is a Poisson process with mean
measure

$$\mu(B) = \iint_B s(x)\, dt\, \eta(dx), \quad B \in \mathcal{B}([0, \infty) \times D). \tag{2.8}$$

Our goal will be to estimate the regularized Levy density $s$, and then to use (2.6)
to retrieve $p$ on $D$ from $s$. To illustrate this strategy consider a continuous
Levy density $p$ such that

$$p(x) = O(x^{-1}), \quad \text{as } x \to 0.$$

Densities of this type admit the regularizing measure $\eta(dx) = x^{-2}dx$ on
domains of the form $D = \{x : 0 < |x| < b\}$. Indeed, $s(x) = x^2 p(x)$ will
be bounded and fulfills (2.7). Clearly, each estimator $\hat s$ for $s$ will induce the
natural estimator $x^{-2}\hat s(x)$ for $p$.
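The claim that $s(x) = x^2 p(x)$ is bounded and satisfies (2.7) can be checked directly. The short computation below is a worked verification added for the reader, under the explicit assumption that $|x|\,p(x) \le C$ on all of $D = \{0 < |x| < b\}$ for some constant $C$ (near the origin this is the growth condition on $p$; away from the origin it is assumed here).

```latex
% Assumption: |x| p(x) <= C for all x in D = {0 < |x| < b}.
\begin{align*}
  0 \le s(x) &= x^{2} p(x) \le C\,|x| \le C\,b ,
     \qquad x \in D, \\
  \int_{D} s^{2}(x)\,\eta(dx)
     &= \int_{D} x^{4} p^{2}(x)\, x^{-2}\,dx
      = \int_{D} \bigl(|x|\,p(x)\bigr)^{2}\,dx
      \le 2\,b\,C^{2} < \infty .
\end{align*}
```

Note that the same computation fails for a density behaving like $x^{-2}$ at the origin, since then $\int_D x^2 p^2(x)\,dx$ diverges; the growth rate of $p$ near zero therefore matters for the choice of $\eta$.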

The previous methodology is motivated by recent results on the estimation
of intensity functions of non-homogeneous Poisson processes (see [30]). In
that paper, a type of projection estimator is proposed, whereas penalized
projection estimation is used as a data-driven criterion for selecting the best
space among a family of linear spaces. However, these procedures focus
on finite Poisson point processes and on classes of intensity functions that
are defined with respect to a finite reference measure (see Section 3 for a
more detailed description of this hypothesis). Actually, the value of the
reference measure plays a key role in the definitions of projection estimators
and penalization. Our goal in this section is to implement and justify a
projection estimation approach that does not rely on the finiteness of the
Poisson process.

Let us describe the main ingredients of our procedure. Consider the random
functional

$$\gamma_D(f) = -\frac{2}{T} \iint_{[0,T] \times D} f(x)\, J(dt, dx) + \int_D f^2(x)\, \eta(dx), \tag{2.9}$$

which is well defined for any function $f \in L^2((D, \eta))$, where $D \in \mathcal{B}(\mathbb{R}_0)$ and
$\eta$ is as in equations (2.6)-(2.8). Following the terminology of [?] and [30],
we call $\gamma_D$ the contrast function. Throughout this section,

$$\|f\|^2 = \int_D f^2(x)\, \eta(dx),$$

for any $f \in L^2((D, \eta))$. Let $S$ be a finite dimensional subspace of $L^2 = L^2((D, \eta))$.
The projection estimator of $s$ on $S$ is defined by

$$\hat s(x) = \sum_{i=1}^{d} \hat\beta_i\, \varphi_i(x), \tag{2.10}$$

where $\{\varphi_1, \dots, \varphi_d\}$ is an arbitrary orthonormal basis of $S$ and

$$\hat\beta_i = \frac{1}{T} \iint_{[0,T] \times D} \varphi_i(x)\, J(dt, dx). \tag{2.11}$$

Let us give another characterization of the projection estimator.



Remark 2.2 The projection estimator is the unique minimizer of the
contrast function $\gamma_D$ over $S$. Indeed, plugging $f = \sum_{i=1}^{d} \beta_i \varphi_i$ into (2.9) gives
$\gamma_D(f) = \sum_{i=1}^{d} (-2\beta_i \hat\beta_i + \beta_i^2)$, and thus $\gamma_D(f) \ge -\sum_{i=1}^{d} \hat\beta_i^2$ for all $f \in S$,
with equality if and only if $\beta_i = \hat\beta_i$ for every $i$.

In particular, this characterization implies that $\hat s$ does not depend on the
choice of the orthonormal basis, and suggests a mechanism to numerically
approximate $\hat s$ when we do not have an explicit orthonormal basis for $S$.
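To make the projection estimator concrete, here is a minimal sketch for regular histograms. It assumes (these are illustrative choices, not prescriptions of the paper) that $\eta$ is the Lebesgue measure on $D = [a, b)$ with $a > 0$, that the basis is $\varphi_i = h^{-1/2} 1_{\text{bin } i}$ with bin width $h = (b-a)/m$, and that the jumps come from a simulated compound Poisson process with Levy density $p(x) = \lambda e^{-x}$. All function names are hypothetical.

```python
import math
import random

def simulate_jumps(rate, horizon, rng):
    """Jump sizes on [0, horizon] of a compound Poisson process with Exp(1) jumps."""
    jumps = []
    t = rng.expovariate(rate)
    while t <= horizon:
        jumps.append(rng.expovariate(1.0))
        t += rng.expovariate(rate)
    return jumps

def histogram_projection(jumps, horizon, a, b, m):
    """Coefficients (2.11) and bin values of the estimator (2.10) on m regular bins."""
    h = (b - a) / m
    counts = [0] * m
    for x in jumps:
        if a <= x < b:
            counts[min(int((x - a) / h), m - 1)] += 1   # guard against rounding at b
    beta_hat = [n / (horizon * math.sqrt(h)) for n in counts]   # (1/T) * sum phi_i(jump)
    s_hat = [beta / math.sqrt(h) for beta in beta_hat]          # value of s_hat on bin i
    return beta_hat, s_hat

def contrast(beta, beta_hat):
    """gamma_D(f) for f = sum_i beta_i * phi_i, using the expansion of Remark 2.2."""
    return sum(-2.0 * c * ch + c * c for c, ch in zip(beta, beta_hat))

rng = random.Random(1)
rate, horizon, a, b, m = 3.0, 2000.0, 0.5, 2.5, 4
jumps = simulate_jumps(rate, horizon, rng)
beta_hat, s_hat = histogram_projection(jumps, horizon, a, b, m)

h = (b - a) / m
# Bin averages of the true density p(x) = rate * exp(-x): the orthogonal projection of s.
s_proj = [rate * (math.exp(-(a + i * h)) - math.exp(-(a + (i + 1) * h))) / h
          for i in range(m)]
print(s_hat)
print(s_proj)
```

On each bin the estimator reduces to (number of jumps in the bin) / (T h), and any other coefficient vector yields a larger contrast value, in line with Remark 2.2.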

The remark above helps to make sense of $\hat s$ as an estimator of the regularized
Levy density $s$ because the minimizer of $E[\gamma_D(f)]$ over all $f \in S$ is precisely
the closest function in $S$ to $s$. Concretely, the orthogonal projection of $s$ on
the subspace $S$, namely

$$s^{\perp}(x) = \sum_{i=1}^{d} \Big( \int_D \varphi_i(y)\, s(y)\, \eta(dy) \Big) \varphi_i(x), \tag{2.12}$$

is such that

$$E[\gamma_D(s^{\perp})] \le E[\gamma_D(f)], \quad \forall f \in S. \tag{2.13}$$

Moreover, we can readily verify that $\hat s$ is an unbiased estimator of the
orthogonal projection $s^{\perp}$. In order to assess the quality of estimation, we
compute the "square error" of $\hat s$:



X 



i=l 



ipi{x) 



J{dt, dx) — s{x) dt rj{dx) 



[0,T]xD 



(2.14) 
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Then, by the standard formula for the variance of Poisson integrals, the mean 
square error takes the form 

1 r 

E M =fJ2 vK^Hx)v{dx). (2.15) 
i=i I 

The quantity E [x^] is called the variance term and the equation above shows 
that this term will shrink to when the time horizon T goes to infinity. 
Moreover, the risk of s, E [\\s — can be decomposed into a nonrandom 
term plus the previous variance term: 

E [||s - sf ] = ||s - s^f + E [x^] . (2.16) 

The first term, called the bias term, accounts for the distance of the unknown 
function s to the model S and does not depend on the estimation criteria we 
use within the model. 
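For completeness, the two displayed formulas can be derived in a few lines. The derivation below is a reader's sketch; it uses only the mean and variance formulas for integrals with respect to a Poisson random measure with mean measure $s(x)\,dt\,\eta(dx)$, and writes $\beta_i = \int_D \varphi_i s\, d\eta$ for the coefficients of $s^{\perp}$ (a notation introduced here for convenience).

```latex
\begin{align*}
  E[\hat\beta_i]
    &= \frac{1}{T}\iint_{[0,T]\times D} \varphi_i(x)\, s(x)\,dt\,\eta(dx)
     = \beta_i , \\
  \operatorname{Var}(\hat\beta_i)
    &= \frac{1}{T^{2}}\iint_{[0,T]\times D} \varphi_i^{2}(x)\, s(x)\,dt\,\eta(dx)
     = \frac{1}{T}\int_{D} \varphi_i^{2}(x)\, s(x)\,\eta(dx) , \\
  E[\chi^{2}]
    &= \sum_{i=1}^{d} E\bigl[(\hat\beta_i-\beta_i)^{2}\bigr]
     = \sum_{i=1}^{d} \operatorname{Var}(\hat\beta_i) , \\
  \|s-\hat s\|^{2}
    &= \|s-s^{\perp}\|^{2} + \|s^{\perp}-\hat s\|^{2}
     \quad\text{since } s-s^{\perp} \perp S \text{ and } s^{\perp}-\hat s \in S .
\end{align*}
```

Taking expectations in the last line gives the bias-variance decomposition of the risk.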

The next natural problem to tackle is to design a data-driven scheme for
selecting a "good" model from a collection of linear models $\{S_m, m \in \mathcal{M}\}$.
Namely, we wish to select a model that approximately realizes the best
trade-off between the risk of estimation within the model and the distance of the
unknown Levy density to the model. Let $\hat s_m$ and $s_m^{\perp}$ be respectively the
projection estimator and the orthogonal projection of $s$ on $S_m$. For each
$m \in \mathcal{M}$, let $\chi_m^2$ be as in (2.14). The following simplifications of (2.16) give
insight on a possible solution:

$$\begin{aligned}
E[\|s - \hat s_m\|^2] &= \|s - s_m^{\perp}\|^2 + E[\chi_m^2] \\
&= \|s\|^2 - \|s_m^{\perp}\|^2 + E[\chi_m^2] \\
&= \|s\|^2 - E[\|\hat s_m\|^2] + 2E[\chi_m^2] \\
&= \|s\|^2 + E[\gamma_D(\hat s_m) + \mathrm{pen}(m)],
\end{aligned} \tag{2.17}$$

where pen(m) is defined in terms of an orthonormal basis $\{\varphi_{1,m}, \dots, \varphi_{d_m,m}\}$
for $S_m$ by the equation:

$$\mathrm{pen}(m) = \frac{2}{T^2} \iint_{[0,T] \times D} \Big( \sum_{i=1}^{d_m} \varphi_{i,m}^2(x) \Big) J(dt, dx). \tag{2.18}$$

Equation (2.17) shows that the risk of $\hat s_m$ moves "parallel" to the expectation
of the observable statistic $\gamma_D(\hat s_m) + \mathrm{pen}(m)$. This fact heuristically justifies
choosing the model that minimizes such a penalized contrast value. In
general, it makes sense to consider penalized projection estimators (p.p.e.)
of the form

$$\tilde s = \hat s_{\hat m}, \tag{2.19}$$

where pen : $\mathcal{M} \to [0, \infty)$, $\hat s_m$ is the projection estimator on $S_m$ (see (2.10)),
and $\hat m = \arg\min_{m \in \mathcal{M}} \{\gamma_D(\hat s_m) + \mathrm{pen}(m)\}$.
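The selection rule can be sketched for regular histograms as follows. This is an illustration under assumptions that are ours, not the paper's: $\eta$ is the Lebesgue measure on $D = [a, b)$, the models are histograms with $m$ equal bins (so $\sum_i \varphi_{i,m}^2 = 1/h$ on $D$ with $h = (b-a)/m$), the penalty has the shape of (2.18) scaled by a constant `c` (with `c = 2` giving (2.18) itself), and the data are simulated from a compound Poisson process. Function names are hypothetical.

```python
import math
import random

def simulate_jumps(rate, horizon, rng):
    """Jump sizes on [0, horizon] of a compound Poisson process with Exp(1) jumps."""
    jumps = []
    t = rng.expovariate(rate)
    while t <= horizon:
        jumps.append(rng.expovariate(1.0))
        t += rng.expovariate(rate)
    return jumps

def penalized_contrast(jumps, horizon, a, b, m, c=2.0):
    """gamma_D(s_hat_m) + pen(m) for the histogram model with m regular bins."""
    h = (b - a) / m
    counts = [0] * m
    for x in jumps:
        if a <= x < b:
            counts[min(int((x - a) / h), m - 1)] += 1
    n_total = sum(counts)
    # gamma_D(s_hat_m) = -sum_i beta_hat_i^2 with beta_hat_i = N_i / (T sqrt(h)).
    contrast = -sum(n * n for n in counts) / (horizon ** 2 * h)
    # (c / T^2) * sum over jumps in D of sum_i phi_i^2(jump) = c * N / (T^2 h).
    penalty = c * n_total / (horizon ** 2 * h)
    return contrast + penalty

def select_model(jumps, horizon, a, b, candidates, c=2.0):
    """Return the minimizer m_hat of the penalized contrast and all scores."""
    scores = {m: penalized_contrast(jumps, horizon, a, b, m, c) for m in candidates}
    m_hat = min(scores, key=scores.get)
    return m_hat, scores

rng = random.Random(3)
rate, horizon, a, b = 3.0, 2000.0, 0.5, 2.5
jumps = simulate_jumps(rate, horizon, rng)
candidates = [1, 2, 4, 8, 16, 32, 64]
m_hat, scores = select_model(jumps, horizon, a, b, candidates)
print(m_hat)
```

With few bins the contrast term is large (high bias); with many bins the penalty dominates (high variance). The selected `m_hat` balances the two using only observable quantities.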

Methods of estimation based on the minimization of penalty functions have
a long history in the literature on regression and density estimation (see, for
instance, [?], [?], and [?]). The general idea is to choose among a given
collection of parametric models the model that minimizes a loss function
plus a penalty term that controls the complexity of the model. Such
penalized estimation was promoted for nonparametric density estimation in [?],
and in the context of non-homogeneous Poisson processes in [30]. Two
main results are obtained in these works, both in the context of
density estimation and of intensity estimation for non-homogeneous Poisson
processes: Oracle inequalities and competitive performance against minimax
estimators. The following section shows that the method outlined above
preserves Oracle inequalities.

3 Risk bounds, oracle inequalities, and rates 
of convergence 

Consider the problem of model selection among a collection of linear models,
$\{S_m, m \in \mathcal{M}\}$, for the regularized Levy density $s$ on $D$ as outlined in the
previous section. We showed through (2.17) that a sensible criterion to decide
for a projection estimator is to penalize its contrast value with a properly
chosen penalty function pen : $\mathcal{M} \to [0, \infty)$. Of course, the "best" model,
namely

$$\bar m = \arg\min_{m \in \mathcal{M}} E[\|s - \hat s_m\|^2], \tag{3.1}$$

is not accessible, but we can aspire to achieve the smallest possible risk up to
a constant. In other words, it is desirable that our estimator $\tilde s$ comply with
an inequality of the form

$$E[\|s - \tilde s\|^2] \le C \inf_{m \in \mathcal{M}} E[\|s - \hat s_m\|^2], \tag{3.2}$$

for a constant $C$ independent of the linear models. The model that achieves
the minimal risk of projection estimation is called the Oracle model and
inequalities of the type (3.2) are called Oracle inequalities. Approximate Oracle
inequalities were proved in [30] for the intensity function of a nonhomogeneous
Poisson process $\{N_A\}_{A \in \mathcal{V}}$ on a measurable space $(V, \mathcal{V})$. Concretely,
[30] defines projection estimators $\hat s_m$ and penalized projection estimators $\tilde s$
satisfying



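$$E\Big[ \int_V |s(v) - \tilde s(v)|^2\, \frac{\zeta(dv)}{\zeta(V)} \Big] \le C \inf_{m \in \mathcal{M}} E\Big[ \int_V |s(v) - \hat s_m(v)|^2\, \frac{\zeta(dv)}{\zeta(V)} \Big] + \frac{C'}{\zeta(V)}, \tag{3.3}$$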

where $s$ and $\zeta$ are respectively a bounded measurable function and a finite
measure on $V$ such that

$$E[N_A] = \int_A s(v)\, d\zeta(v), \quad A \in \mathcal{V}.$$

The finiteness of $\zeta$ plays an important role in the definition of the estimators,
and in obtaining the Oracle inequality (3.3). However, such a property is not
necessarily satisfied by the mean measure of the Poisson process $J(\cdot)$ of
(2.3) on $\mathcal{B}([0,T] \times D)$ (for instance, if $D = \{|x| > \varepsilon\}$ under $\zeta(dv) = dx\,dt$
as in (2.4), or if $D = \{0 < |x| < b\}$ and $\zeta(dv) = x^{-2}dx\,dt$ as in the example
described after Definition 2.1). In this section we show that, based on one
sample of the Levy process $X$ on $[0,T]$, the projection estimators $\{\hat s_m\}_{m \in \mathcal{M}}$
introduced in Section 2 and certain penalized projection estimators $\tilde s$ satisfy
the approximate Oracle inequality

$$E[\|s - \tilde s\|^2] \le C \inf_{m \in \mathcal{M}} E[\|s - \hat s_m\|^2] + \frac{C'}{T},$$

where $s$ is a regularized Levy density, and the constants $C$, $C'$ depend only
on the "complexity" of the family of linear models. Actually, we will be able
to estimate the order of the constants $C$ and $C'$ appearing in the Oracle
inequality.

The main tool in obtaining Oracle inequalities is an upper bound for the
risk of the penalized projection estimator $\tilde s$ of (2.19). The proof of this
bound is a simple variation of the argument of [30]; however, to overcome the
possible lack of finiteness of $\zeta$ and to avoid unnecessary use of upper bounds,
the dimension of the linear model is explicitly included in the penalization.
Finally, the obtained risk bound is used to assess the rate of convergence of
$\tilde s$ to $s$ in the long run (as $T \to \infty$) when $s$ is "smooth" and the considered
linear spaces are piecewise polynomials.

The following regularity condition was introduced in [30] to make a
distinction between not too "large" families of linear models and certain
wavelet-type linear models. We will focus here on the simplest case:

Definition 3.1 A collection of models $\{S_m, m \in \mathcal{M}\}$ is said to be polynomial
if there exist constants $\Gamma > 0$ and $R \ge 0$ such that for every positive integer
$n$

$$\#\{m \in \mathcal{M} : d_m = n\} \le \Gamma n^R,$$

where $d_m$ stands for the dimension of the model $S_m$, while $\#$ denotes
cardinality.

Below, we return to the setting of Section 2; that is to say, $X = \{X(t)\}_{0 \le t \le T}$
is a Levy process with Levy density $p$ and regularized Levy density $s$ on
a domain $D \in \mathcal{B}(\mathbb{R}_0)$ under a regularizing measure $\eta$ (see Definition 2.1).
Define also

$$D_m = \sup\Big\{ \|f\|_\infty^2 : f \in S_m,\ \|f\|^2 = \int_D f^2(x)\, \eta(dx) = 1 \Big\}. \tag{3.4}$$

Remark 3.2 If $\{\varphi_{1,m}, \dots, \varphi_{d_m,m}\}$ is an arbitrary orthonormal basis of $S_m$,
then $D_m = \|\sum_{i=1}^{d_m} \varphi_{i,m}^2\|_\infty$ (see Section 9 for a verification).
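As a worked example of Remark 3.2 (added for illustration, under the assumption that $\eta$ is the Lebesgue measure on $D = [a, b]$ and that $S_m$ is the space of regular histograms with $m$ bins $I_1, \dots, I_m$ of width $h = (b-a)/m$), the constant $D_m$ is explicit:

```latex
\begin{align*}
  \varphi_{i,m} &= h^{-1/2}\,\mathbf{1}_{I_i}, \qquad i = 1,\dots,m , \\
  \sum_{i=1}^{m} \varphi_{i,m}^{2}(x) &= \frac{1}{h} \quad\text{for } x \in [a,b] , \\
  D_m &= \Bigl\|\sum_{i=1}^{m}\varphi_{i,m}^{2}\Bigr\|_{\infty}
       = \frac{m}{b-a} , \qquad d_m = m .
\end{align*}
```

In this case there is exactly one model per dimension, so the collection is polynomial, and a restriction of the form $D_m \le T$ simply caps the number of bins at $(b-a)T$.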

We now present the main result of this section (see Section 9 for the proof):

Theorem 3.3 Let $\{S_m, m \in \mathcal{M}\}$ be a polynomial family of finite dimensional
linear subspaces of $L^2((D, \eta))$ and let $\mathcal{M}_T = \{m \in \mathcal{M} : D_m \le T\}$. If
$\hat s_m$ and $s_m^{\perp}$ are respectively the projection estimator and the orthogonal
projection of the regularized Levy density $s$ on $S_m$, then the penalized projection
estimator $\tilde s_T$ on $\{S_m\}_{m \in \mathcal{M}_T}$ defined by (2.19) is such that

$$E[\|s - \tilde s_T\|^2] \le C \inf_{m \in \mathcal{M}_T} \big\{ \|s - s_m^{\perp}\|^2 + E[\mathrm{pen}(m)] \big\} + \frac{C'}{T}, \tag{3.5}$$

whenever pen : $\mathcal{M}_T \to [0, \infty)$ takes one of the following forms for some fixed
(but arbitrary) constants $c > 1$, $c' > 0$, and $c'' > 0$:
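(a) $\mathrm{pen}(m) \ge c\,\frac{D_m N}{T^2} + c'\,\frac{d_m}{T}$, where $N = J([0,T] \times D)$ is the number of
jumps prior to $T$ with sizes falling in $D$ and where it is assumed that $\rho =
\int_D s(x)\,\eta(dx) < \infty$;

(b) $\mathrm{pen}(m) \ge c\,\frac{\hat V_m}{T}$, where $\hat V_m$ is defined in terms of an orthonormal basis
$\{\varphi_{i,m}\}_{i=1}^{d_m}$ of $S_m$ by

$$\hat V_m = \frac{1}{T} \iint_{[0,T] \times D} \Big( \sum_{i=1}^{d_m} \varphi_{i,m}^2(x) \Big) J(dt, dx), \tag{3.6}$$

and where it is assumed that $\beta = \inf_{m \in \mathcal{M}} \frac{E[\hat V_m]}{D_m} > 0$ and that $\phi = \inf_{m \in \mathcal{M}} \frac{D_m}{d_m} > 0$;

(c) $\mathrm{pen}(m) \ge c\,\frac{\hat V_m}{T} + c'\,\frac{D_m}{T} + c''\,\frac{d_m}{T}$.

Moreover, the constant $C$ depends only on $c$, $c'$ and $c''$, while $C'$ varies with
$c$, $c'$, $c''$, $\Gamma$, $R$, $\|s\|$, $\|s\|_\infty$, $\rho$, $\beta$, and $\phi$.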



Remark 3.4 In Remark 9.5, the order of the constants $C$ and $C'$ is
analyzed. We will show that for $c > 2$ and for arbitrary $\varepsilon > 0$, there is a
constant $C'(\varepsilon)$ (increasing) so that

$$E[\|s - \tilde s_T\|^2] \le (1 + \varepsilon) \inf_{m \in \mathcal{M}_T} \big\{ \|s - s_m^{\perp}\|^2 + E[\mathrm{pen}(m)] \big\} + \frac{C'(\varepsilon)}{T}. \tag{3.7}$$

As a first use of the previous risk bound, we obtain Oracle inequalities for
our p.p.e. The next corollary immediately follows from the first equality in
(2.17), equation (2.15), and part (b) above:

Corollary 3.5 In the setting of Theorem 3.3, if the penalty function is of
the form $\mathrm{pen}(m) = c\,\hat V_m / T$, for every $m \in \mathcal{M}_T$, $\beta > 0$, and $\phi > 0$, then

$$E[\|s - \tilde s_T\|^2] \le C_1 \inf_{m \in \mathcal{M}_T} \big\{ E[\|s - \hat s_m\|^2] \big\} + \frac{C_2}{T}, \tag{3.8}$$

for a constant $C_1$ depending only on $c$, and a constant $C_2$ depending on $c$, $\Gamma$,
$R$, $\|s\|$, $\|s\|_\infty$, $\beta$, and $\phi$.
As a second application of (3.5), we analyze the "long run" ($T \to \infty$) rate of
convergence of penalized projection estimators on regular piecewise
polynomials, when the Levy density is "smooth". More precisely, restricted to the
window of estimation $D = [a, b] \subset \mathbb{R}_0$, the Levy density $s$ is assumed to
belong to the Besov space¹ $B^\alpha_\infty(L^p([a, b]))$ with some $p \in [2, \infty]$ and $\alpha > 0$ (see
Sections 2.9-10 of [17] for the definition). An important reason for the choice
of this class of functions is the availability of estimates for the error of
approximation by splines², trigonometric polynomials, and wavelet expansions
(see for instance Chapter 12 of [17], and Lemma 13 of [8]). In particular, if
$S_m^k$ denotes the space of piecewise polynomials of degree bounded by $k$, based
on the regular partition of $[a,b]$ with $m$ pieces ($m \ge 1$), Theorem 12.2.4 in
[17] implies that for any $s \in B^\alpha_\infty(L^p([a,b]))$ with $k > \alpha - 1$, there exists a
constant $C(s)$ such that

$$d_p(s, S_m^k) \le C(s)\, m^{-\alpha}, \tag{3.9}$$

where $d_p$ is the distance induced by the $L^p$-norm on $([a,b], dx)$. Actually,
$C(s)$ can be taken to be increasing in $|s|_{B^\alpha_\infty(L^p)}$, the standard seminorm on
$B^\alpha_\infty(L^p([a,b]))$ (see (10.1), Chapter 2, in [17]). Combining (3.5) with (3.9), we
obtain the following result (see Section 9 for a proof).

Corollary 3.6 Let $D = [a, b] \subset \mathbb{R}_0$ and let $S_m^k$ be the space of piecewise
polynomials of degree at most $k$ based on the regular partition of $[a, b]$ with
$m$ pieces ($m \ge 1$). Following the notation of Theorem 3.3, let $\tilde s_T$ be the
penalized projection estimator on $\{S_m^k\}_{m \in \mathcal{M}_T}$ with penalization $\mathrm{pen}(m) =
c\,\hat V_m / T + c'\,D_m / T + c''\,d_m / T$ (for some fixed $c > 1$ and $c', c'' > 0$). Then, if the
restriction of the Levy density $s$ to $[a,b]$ is a member of $B^\alpha_\infty(L^p([a, b]))$ with
$2 \le p \le \infty$ and $0 < \alpha < k + 1$, then

$$\limsup_{T \to \infty} T^{2\alpha/(2\alpha+1)}\, E[\|s - \tilde s_T\|^2] < \infty.$$

Moreover, for any $R > 0$ and $L > 0$,

$$\limsup_{T \to \infty} T^{2\alpha/(2\alpha+1)} \sup_{s \in \Theta(R,L)} E[\|s - \tilde s_T\|^2] < \infty, \tag{3.10}$$

where $\Theta(R, L)$ consists of all Levy densities $s$ such that $\|s\|_{L^\infty([a,b])} \le R$, and $s$
restricted to $[a, b]$ is a member of $B^\alpha_\infty(L^p([a, b]))$ with seminorm $|s|_{B^\alpha_\infty(L^p)} \le L$.

¹ These Besov spaces are also called Lipschitz or Holder spaces.

² Piecewise polynomial functions $f$ such that on each compact interval, $f$ is made up of
only finitely many polynomial pieces.

The previous result implies that the p.p.e. on regular splines has a rate
of convergence of order $T^{-2\alpha/(2\alpha+1)}$ for the class of Besov Levy densities
$\Theta(R,L)$. We will see in the next section that the rate cannot be improved
(see Corollary 4.3 and Remark 4.4).
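The exponent $2\alpha/(2\alpha+1)$ comes from the usual balance between bias and penalty in the risk bound (3.5). The heuristic below is a reader's sketch rather than the proof referred to above; it assumes that the expected penalty grows linearly in the number of pieces, $E[\mathrm{pen}(m)] \le K m / T$ for some constant $K$ (this holds, for instance, for regular histograms under a bounded $s$), and it uses $p \ge 2$ to compare the $L^2$ and $L^p$ distances on the bounded interval $[a, b]$.

```latex
\begin{align*}
  \|s - s_m^{\perp}\|^{2}
    &\le (b-a)^{1-2/p}\, d_p(s, S_m^{k})^{2}
     \le C_0\, m^{-2\alpha}
     && \text{by (3.9)}, \\
  \|s - s_m^{\perp}\|^{2} + E[\mathrm{pen}(m)]
    &\le C_0\, m^{-2\alpha} + K\,\frac{m}{T} , \\
  m_T = \bigl\lceil T^{1/(2\alpha+1)} \bigr\rceil
    &\;\Longrightarrow\;
    C_0\, m_T^{-2\alpha} + K\,\frac{m_T}{T}
    \le (C_0 + 2K)\, T^{-2\alpha/(2\alpha+1)} , \qquad T \ge 1 .
\end{align*}
```

Since $m_T$ grows more slowly than $T$, the model with $m_T$ pieces eventually satisfies the restriction defining $\mathcal{M}_T$, so the infimum in (3.5) is of order $T^{-2\alpha/(2\alpha+1)}$, which dominates the remainder term of order $1/T$.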

4 On the minimax risk for the estimation of 
smooth Levy densities 

This section presents some results on the minimax risk of estimation for
certain families of smooth Levy densities. Roughly speaking, a minimax risk
on a given family $\Theta$ of "parameters" has the following general form:

$$\inf_{\hat s} \sup_{s \in \Theta} E_s[d(s, \hat s)],$$

where the infimum is taken over all the estimators $\hat s$ (based on the available random
data, whose distribution is itself determined by the parameter $s$), and
$d(s, \hat s)$ is a function that measures how distant $s$ and $\hat s$ are from each other. In
some sense, $\sup_{s \in \Theta} E_s[d(s, \hat s)]$ measures the maximum error that can arise
when using the estimator $\hat s$. Therefore, an estimator that approximately
attains the minimax risk is desirable. Comparison to the minimax risk
is one of the most widely used measures of performance in statistical estimation.
In fact, minimax type results have been obtained in very general contexts
(see for instance [22] and [8] in the case of density estimation based on i.i.d.
random variables, and [?] and [?] in the case of intensity estimation based
on finite Poisson point processes).

Since the jumps of a Levy process can be associated with a Poisson point
process on $\mathbb{R}_+ \times \mathbb{R}\setminus\{0\}$, many results and techniques for the statistical inference
of Poisson processes can be translated into the context of Levy processes.
Following this approach, we adapt below a result of Kutoyants [22]
(Theorem 6.5) on the asymptotic minimax risk for the estimation of "smooth"
intensity functions of a Poisson point process on $[0, 1]$, based on $n$
independent copies. The idea of the proof is due to Ibragimov and Has'minskii and is
based on the statistical tools for distributions satisfying the Local Asymptotic
Normality (LAN) property (see Chapter II and Section IV.5 of [22]). Some
generalizations and consequences are also deduced.

Let us introduce a loss function $\ell : \mathbb{R} \to \mathbb{R}$ with the following properties:

(i) $\ell(\cdot)$ is nonnegative, $\ell(0) = 0$ but not identically 0, and continuous at 0;

(ii) it is symmetric: $\ell(u) = \ell(-u)$ for all $u$;

(iii) for any $c > 0$, $\{u : \ell(u) < c\}$ is a convex set;

(iv) $\ell(u) \exp\{-\varepsilon |u|^2\} \to 0$ as $|u| \to \infty$, for any $\varepsilon > 0$.

Consider the problem of estimating the Levy density $s$ of a Levy process
$\{X(t)\}_{0 \le t \le T}$. We are interested in the error of estimation at a fixed point
$x_0 \in \mathbb{R}_0$ and in minimax results of the form:

$$\liminf_{T \to \infty} \Big\{ \inf_{\hat s_T} \sup_{s \in \Theta} E_s\big[ \ell\big( T^{\gamma} (\hat s_T(x_0) - s(x_0)) \big) \big] \Big\} > 0, \tag{4.1}$$

where the infimum is over all the "estimators" $\hat s_T$ based on the jumps of the
Levy process $\{X(t)\}_{0 \le t \le T}$, $\Theta$ is a collection of Levy densities, and $\gamma > 0$
is a constant depending on the family $\Theta$. In other words, (4.1) implies the
existence of a lower bound $B > 0$ and a time $T_0$ such that from that time on,
no non-anticipative³ estimator will do better than $T^{-\gamma}$ uniformly on
$\Theta$, in the sense that there would exist an $s \in \Theta$ for which

$$E_s\big[ \ell\big( T^{\gamma} (\hat s_T(x_0) - s(x_0)) \big) \big] > B.$$

Therefore, the inequality (4.1) imposes a constraint on the rate of convergence
at $x_0$ that the estimators can attain. By estimators, we mean a "process"
$\hat s : \mathbb{R}_0 \times \Omega \to \mathbb{R}$ such that for each $x \in \mathbb{R}_0$, the random variable $\hat s(x; \cdot)$
is measurable with respect to the $\sigma$-field generated by the point process $J$,
while for each $\omega \in \Omega$, $\hat s(\cdot; \omega)$ is measurable with respect to the product $\sigma$-field.

The considered Levy densities satisfy a Holder condition of order $\beta$ on a
given window of estimation. Concretely, fix an interval $[a,b] \subset \mathbb{R}\setminus\{0\}$, and
let $k \in \{0, 1, \dots\}$ and $\beta \in (0, 1]$. Define the family $\Theta_{k+\beta}(L; [a, b])$ of functions
$f : \mathbb{R}\setminus\{0\} \to \mathbb{R}$ such that $f$ is $k$ times differentiable on $[a, b]$ and

$$|f^{(k)}(x_1) - f^{(k)}(x_2)| \le L\,|x_1 - x_2|^{\beta}, \quad \forall\, x_1, x_2 \in [a,b]. \tag{4.2}$$

³ Here, non-anticipative means that the estimator is based on the jumps that occurred
up to the present.
Below, $\mathcal{L}$ stands for the class of all Levy densities; that is, all functions
$s : \mathbb{R}_0 \to \mathbb{R}_+$ such that

$$\int_{\mathbb{R}_0} (x^2 \wedge 1)\, s(x)\,dx < \infty.$$

The following result is a minor variation of Theorem 6.5 of [22]. For
completeness, we present its proof in Section 9.2.



Theorem 4.1 If $x_0$ is an interior point of the interval $[a,b] \subset \mathbb{R}\setminus\{0\}$, then

$$\liminf_{T \to \infty} \Big\{ \inf_{\hat s_T} \sup_{s \in \Theta} E_s\big[ \ell\big( T^{\alpha/(2\alpha+1)} (\hat s_T(x_0) - s(x_0)) \big) \big] \Big\} > 0, \tag{4.3}$$

where $\alpha := k + \beta$, $\Theta := \mathcal{L} \cap \Theta_\alpha(L; [a,b])$, and the infimum is over all the
estimators $\hat s_T$ based on those jumps of the Levy process $\{X(t)\}_{0 \le t \le T}$ whose
sizes lie in $[a, b]$.

As already noticed in [22], the previous result can be strengthened to be, in a
certain sense, uniform in $x_0 \in (a, b)$ (see Section 9 for a proof).

Corollary 4.2 With the notation and hypotheses of Theorem 4.1,

$$\liminf_{T \to \infty} \Big\{ \inf_{\hat s_T} \inf_{x \in (a,b)} \sup_{s \in \Theta} E_s\big[ \ell\big( T^{\alpha/(2\alpha+1)} (\hat s_T(x) - s(x)) \big) \big] \Big\} > 0. \tag{4.4}$$

Let us now apply the above assertion to obtain the long run minimax risk of
measurable estimators, under the $L^2$-norm. Here, measurable means that for
each $\omega \in \Omega$, $\hat s(\cdot; \omega)$ is a measurable function on $([a,b], \mathcal{B}([a,b]))$. In Section
9.2 a proof is given.



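Corollary 4.3 Let $[a,b]$ be a closed interval of $\mathbb{R}\setminus\{0\}$. Then

$$\liminf_{T \to \infty} T^{2\alpha/(2\alpha+1)} \Big\{ \inf_{\hat s_T} \sup_{s \in \Theta} E_s\Big[ \int_a^b (\hat s_T(x) - s(x))^2\, dx \Big] \Big\} > 0, \tag{4.5}$$

where $\alpha := k + \beta$, $\Theta := \mathcal{L} \cap \Theta_\alpha(L; [a, b])$, and the infimum is over all the
measurable estimators based on the jumps of the Levy process $\{X(t)\}_{0 \le t \le T}$
whose sizes lie in $[a, b]$.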
Remark 4.4 The proofs of the previous results can be readily modified to
cover even smaller classes of Levy densities $\Theta$. For instance, $\Theta = \mathcal{L} \cap
\Theta_\alpha(L; [a,b]) \cap \{s : \|s\|_{L^\infty([a,b])} \le R\}$. This class has a very close relationship
with the family of Besov densities $\Theta(R,L)$ introduced in (3.10). Indeed, the
class $\Theta_\alpha(L; [a,b])$ is contained in $B^\alpha_\infty(L^\infty([a,b]))$ (see Section 2.9 of [17]).
Since $B^\alpha_\infty(L^\infty) \subset B^\alpha_\infty(L^p)$, (4.5) holds true on $\Theta = \Theta(R,L)$. Therefore, the
p.p.e. on regular splines, described in the previous section, has the best
possible rate of convergence and, moreover, achieves the minimax rate of
convergence on $\Theta(R,L)$. This type of property is called adaptivity in that, without
knowing the smoothness of $s$ (controlled by $\alpha$), the p.p.e. asymptotically reaches
the minimax risk up to a constant. See for instance Section 4 of [18] for
a discussion on adaptivity.



5 Calibration based on discrete time data: 
approximation of Poisson integrals 

One drawback of the method outlined in Section 2 is that, in general, we do
not observe the jumps of a Levy process $X = \{X(t)\}_{t\ge0}$. In practice, we can
aspire to sample the process $X(t)$ at discrete times, but we are able to measure
neither the sizes of the jumps $\Delta X(t) = X(t) - X(t^-)$ nor the times of the
jumps $\{t : |\Delta X(t)| > 0\}$. Poisson integrals of the type

$$I(f) := \iint_{[0,T]\times\mathbb{R}_0} f(x)\,\mathcal{J}(dt,dx) = \sum_{t\le T} f(\Delta X(t)), \qquad (5.1)$$

are simply not accessible. In this section, we discuss the approximation of
the integral (5.1) based on time series of the form $\{X(t_k^n)\}_{k=0}^n$, where $t_k^n = kT/n$.

Let us motivate our approximation scheme. The natural way of interpolating
the sample path of a Levy process from the sampling observations $\{X(t_k^n)\}_{k=0}^n$
is to take a cadlag piecewise constant approximation of the form

$$X^n(t) := \sum_{k=1}^n X(t_{k-1}^n)\,\mathbf{1}\left(t\in[t_{k-1}^n,t_k^n)\right), \quad t\in[0,T), \qquad (5.2)$$

where, as usual, $\mathbf{1}$ is the indicator function of the corresponding set. It is quite
simple to prove that $X^n$ converges to $X$ at any finite set of time points with
probability one (a quality shared by any right-continuous process $X$). Furthermore,
the approximated process $X^n$, having independent increments, converges to
$X$ in $D[0,\infty)$ under the Skorohod metric (see VI of j2H], and Example VI.18
for a concrete case). Hence, we might expect that

$$I_n(f) := \sum_{t\le T} f(\Delta X^n(t)) = \sum_{k=1}^n f\left(X(t_k^n)-X(t_{k-1}^n)\right), \qquad (5.3)$$

converges to (5.1) as $n\to\infty$. Indeed, we prove the weak convergence of (5.3)
to (5.1) using well-known facts on the transition distributions of $X$ in small
time (see for instance p. 39 of [H], Corollary 8.9 of [34j, or Corollary 3 of
jSni). More precisely:

Lemma 5.1 Let $X = \{X(t)\}_{t\ge0}$ be a Levy process with Levy measure $\nu$.
Then:

1) For each $a > 0$,

$$\lim_{t\to0}\frac{1}{t}P(X(t)\ge a) = \nu([a,\infty)), \quad\text{and}\quad \lim_{t\to0}\frac{1}{t}P(X(t)\le -a) = \nu((-\infty,-a]). \qquad (5.4)$$

2) For any continuous bounded function $h$ vanishing on a neighborhood of
the origin,

$$\lim_{t\to0}\frac{1}{t}E[h(X(t))] = \int_{\mathbb{R}_0} h(x)\,\nu(dx). \qquad (5.5)$$

Remark 5.2 In particular, the two parts of the previous lemma imply (5.5)
when $h(x) = \mathbf{1}_{[a,b]}(x)f(x)$, where $[a,b]$ is an interval of $\mathbb{R}_0$ and $f$ is a
continuous function.
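
The small-time limit in Lemma 5.1 can be checked numerically in a case where both sides are computable. The following is only an illustrative sketch: it assumes a Gamma Levy process (the model of Section 7.2.1), for which $X(t)$ has a Gamma density with shape $\alpha t$ and scale $\beta$, and $\nu(dx) = \alpha e^{-x/\beta}x^{-1}dx$. The function names and the midpoint-rule quadrature are choices made for this example and are not part of the original text.

```python
import math


def gamma_levy_mass(alpha, beta, a, b, steps=2000):
    """nu([a, b]) for the Gamma Levy density alpha * exp(-x / beta) / x (midpoint rule)."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += alpha * math.exp(-x / beta) / x
    return total * h


def scaled_interval_prob(alpha, beta, t, a, b, steps=2000):
    """(1/t) * P(a <= X(t) <= b) when X(t) ~ Gamma(shape=alpha*t, scale=beta)."""
    shape = alpha * t
    # log of the normalizing constant 1 / (beta^shape * Gamma(shape))
    log_norm = -shape * math.log(beta) - math.lgamma(shape)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += math.exp(log_norm + (shape - 1.0) * math.log(x) - x / beta)
    return total * h / t


if __name__ == "__main__":
    limit = gamma_levy_mass(1.0, 1.0, 0.5, 2.0)
    for t in (1.0, 0.1, 0.01, 0.001):
        print(t, scaled_interval_prob(1.0, 1.0, t, 0.5, 2.0), limit)
```

The printed values should approach the Levy mass of the interval as $t$ decreases, which is the content of (5.4) and (5.5) for indicator-type functions on an interval away from the origin.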



It is worth mentioning that [oo] provides stronger results for the small-time
distributional properties of $X(t)$. The following theorem summarizes some
of these results.

Theorem 5.3 Let $X = \{X(t)\}_{t\ge0}$ be a Levy process with Levy measure $\nu$.
Let $F_t$ be the distribution function of $X(t)$ and $G$ the spectral function of $\nu$;
i.e. $G(x) = \nu([x,\infty))$ for $x > 0$ and $G(x) = \nu((-\infty,x])$ for $x < 0$. The
following properties hold:

(i) If $F_t$ and $G$ have densities^1 $f_t$ and $g$, then for $x \ne 0$,

$$\lim_{t\to0}\frac{1}{t}f_t(x) = \frac{\partial}{\partial t}f_t(x)\Big|_{t=0} = g(x), \qquad (5.6)$$

where we additionally assume that $F_t(x)$ is continuous in a neighborhood of
$(t=0,x)$ and that, moreover, $(\partial/\partial t)F_t(x)$, $(\partial/\partial x)F_t(x)$, and
$(\partial/\partial t)(\partial/\partial x)F_t(x)$ exist and are continuous in $(t=0,x)$.

(ii) If $h$ is continuous and bounded and if $\lim_{|x|\to0} h(x)|x|^{-2} = 0$, then

$$\lim_{t\to0}\frac{1}{t}E[h(X(t))] = \int_{\mathbb{R}_0} h(x)\,\nu(dx).$$

Moreover, if $\int_{\mathbb{R}_0}(|x|\wedge1)\,\nu(dx) < \infty$, it is enough to postulate that
$h(x)(|x|\wedge1)^{-1}$ is continuous and bounded.

^1 The function $g$ is said to be the density of the spectral function $G$ if $G'(x) = g(x)$
for $x < 0$ and $G'(x) = -g(x)$ for $x > 0$.

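Limiting results like (5.5) are useful to establish the convergence in
distribution of $I_n(f)$ since, the increments of $X$ being independent and
identically distributed,

$$E\left[e^{iuI_n(f)}\right] = \left(E\left[e^{iuf(X(T/n))}\right]\right)^n = \left(1+\frac{a_n}{n}\right)^n,$$

where $a_n = nE[h(X(T/n))]$ with $h(x) = e^{iuf(x)} - 1$. So, if $f$ is such that

$$\lim_{t\to0}\frac{1}{t}E\left[e^{iuf(X(t))}-1\right] = \int_{\mathbb{R}_0}\left(e^{iuf(x)}-1\right)\nu(dx), \qquad (5.7)$$

then $a_n$ converges to $a = T\int_{\mathbb{R}_0} h(x)\,\nu(dx)$, and thus

$$\lim_{n\to\infty}\left(1+\frac{a_n}{n}\right)^n = \lim_{n\to\infty} e^{n\log(1+a_n/n)} = e^a.$$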

We thus have the following result (see Section for verification): 

Proposition 5.4 Let $X = \{X(t)\}_{t\ge0}$ be a Levy process with Levy measure
$\nu$. Then,

$$\lim_{n\to\infty}E\left[e^{iuI_n(f)}\right] = \exp\left\{T\int_{\mathbb{R}_0}\left(e^{iuf(x)}-1\right)\nu(dx)\right\},$$

if $f$ satisfies either one of the following:

1) $f(x) = \mathbf{1}_{[a,b]}(x)h(x)$ for an interval $[a,b]\subset\mathbb{R}_0$ and a continuous function
$h$;

2) $f(x)$ is continuous on $\mathbb{R}_0$ and $\lim_{|x|\to0} f(x)|x|^{-2} = 0$.

In particular, $I_n(f)$ converges in distribution to $I(f)$ under either of the two
previous conditions.

Remark 5.5 Clearly, if $f$ and $f^2$ satisfy (5.5), then the mean and variance
of $I_n(f)$ obey the asymptotics:

$$\lim_{n\to\infty}E[I_n(f)] = T\int_{\mathbb{R}_0} f(x)\,\nu(dx);$$
$$\lim_{n\to\infty}\mathrm{Var}[I_n(f)] = T\int_{\mathbb{R}_0} f^2(x)\,\nu(dx).$$
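
The approximation (5.3) only needs the sampled values of the path. As a minimal sketch (the function names below are hypothetical and chosen for this example), $I_n(f)$ can be computed from a list of equally spaced observations as follows; `indicator_times` builds a function of the first type allowed in Proposition 5.4.

```python
def poisson_integral_approx(samples, f):
    """I_n(f) = sum_k f(X(t_k) - X(t_{k-1})) for samples X(t_0), ..., X(t_n)."""
    return sum(f(x1 - x0) for x0, x1 in zip(samples, samples[1:]))


def indicator_times(a, b, h=lambda x: 1.0):
    """f(x) = 1_[a,b](x) * h(x): h restricted to an interval away from the origin."""
    return lambda x: h(x) if a <= x <= b else 0.0


if __name__ == "__main__":
    path = [0.0, 1.0, 1.0, 4.0, 5.0, 9.0]      # toy observations X(t_0), ..., X(t_5)
    count = poisson_integral_approx(path, indicator_times(2.0, 5.0))
    total = poisson_integral_approx(path, indicator_times(2.0, 5.0, h=lambda x: x))
    print(count, total)
```

With $h \equiv 1$ the statistic counts the increments whose size falls in $[a,b]$, which is the discrete-data surrogate for the number of jumps with sizes in that interval.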



6 Estimation Method 



Let us summarize the previous sections and outline the proposed algorithm
of estimation:

Statistician's parameters: The procedure is fed with a Borel window of
estimation $D\subset\mathbb{R}_0$, a collection $\{S_m\}_{m\in\mathcal{M}}$ of finite dimensional linear
models of $L^2((D,\eta))$, and a level of penalization $c > 1$.

Model and data: It is assumed that a Levy process $\{X(t)\}_{t\in[0,T]}$ is
monitored at equally spaced times $t_k^n = kT/n$, $k = 1,\dots,n$, during the time
period $[0,T]$. The data consists of the time series $\{X(t_k^n)\}_{k=1}^n$. The
Levy process admits a regularized Levy density $s$ under the measure $\eta$
on $D$ (see Definition 2.1).

Estimators: Inside the linear model $S_m$, the estimator of $s$ is the
approximated projection estimator:

$$\hat s_m^n(x) := \sum_{i=1}^{d_m}\hat\beta_{i,m}^n\,\varphi_{i,m}(x), \qquad (6.1)$$

where $\{\varphi_{1,m},\dots,\varphi_{d_m,m}\}$ is an orthonormal basis for $S_m$, and

$$\hat\beta_{i,m}^n := \frac{1}{T}\sum_{k=1}^n \varphi_{i,m}\left(X(t_k^n)-X(t_{k-1}^n)\right)$$

is the estimator of the inner product $\beta_{i,m} := \int_D \varphi_{i,m}(x)s(x)\,\eta(dx)$, for
$i = 1,\dots,d_m$. Across the collection of linear models $\{S_m : m\in\mathcal{M}\}$,
the estimator which minimizes $-\|\hat s_m^n\|^2 + c\,\mathrm{pen}^n(m)$ is selected,
where

$$\mathrm{pen}^n(m) := \frac{1}{T^2}\sum_{k=1}^n\left(\sum_{i=1}^{d_m}\varphi_{i,m}^2\left(X(t_k^n)-X(t_{k-1}^n)\right)\right).$$
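
The three ingredients above (coefficients, penalty, and selection across models) can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the penalty is taken as $T^{-2}\sum_k\sum_i\varphi_{i,m}^2$ evaluated at the increments, and the histogram basis with $\eta(dx) = dx$ is used only as a convenient example of an orthonormal basis.

```python
def approx_projection(increments, basis, T):
    """Coefficients beta_hat_i = (1/T) * sum_k phi_i(increment_k), as in (6.1)."""
    return [sum(phi(dx) for dx in increments) / T for phi in basis]


def penalized_contrast(increments, basis, T, c):
    """-||s_hat||^2 + c * pen, with pen = (1/T^2) * sum_k sum_i phi_i(increment_k)^2."""
    coeffs = approx_projection(increments, basis, T)
    norm_sq = sum(b * b for b in coeffs)
    pen = sum(phi(dx) ** 2 for dx in increments for phi in basis) / T ** 2
    return -norm_sq + c * pen


def select_model(increments, models, T, c=2.0):
    """Return the name of the model (dict: name -> basis) with the smallest contrast."""
    return min(models, key=lambda m: penalized_contrast(increments, models[m], T, c))


def histogram_basis(cuts):
    """Orthonormal indicator basis of L2(dx) on the cells [cuts[i-1], cuts[i])."""
    def make(lo, hi):
        scale = (hi - lo) ** -0.5
        return lambda x: scale if lo <= x < hi else 0.0
    return [make(lo, hi) for lo, hi in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    increments = [0.6, 0.7, 1.2, 1.8, 0.1]
    models = {"coarse": histogram_basis([0.5, 2.0]),
              "fine": histogram_basis([0.5, 1.0, 2.0])}
    print(select_model(increments, models, T=2.0))
```

Passing the basis functions as callables keeps the sketch independent of the choice of linear model (histograms, splines, trigonometric polynomials), at the cost of efficiency.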



Remark 6.1 It is worthwhile to point out the great similarity of the scheme
above to some methods of density estimation introduced in ^121. In that paper,
the authors estimate the probability density function $f$ of a random sample
$X_1,\dots,X_n$ by projection estimators of the form:

$$\hat f(x) = \sum_{i=1}^{d}\left\{\frac{1}{n}\sum_{k=1}^{n}\varphi_i(X_k)\right\}\varphi_i(x), \qquad (6.3)$$

where $\{\varphi_i\}_{i=1}^d$ is an orthonormal basis of a linear space $S$ of $L^2(\mathbb{R},dx)$. More
generally, $f$ can be the density function with respect to a measure $\mu$ in the
sense that $P[X_i\in\cdot] = \int_\cdot f(x)\,\mu(dx)$, and the projection estimator above will
be well defined provided that $f\in L^2(\mathbb{R},\mu)$. To solve the problem of model
selection, they introduced penalized projection estimators. One penalty
function considered there is

„ n d 

In some sense, the method outlined at the beginning of this section "works"
as a byproduct of the small time qualities of Levy processes and of standard
methods of nonparametric estimation for probability densities. Indeed,
consider the statistics

$$\hat\beta_{i,m}^{n,j} := \frac{1}{jT/n}\sum_{k=1}^{j}\varphi_{i,m}\left(X(t_k^n)-X(t_{k-1}^n)\right),$$

where $T/n$ is the time span of the increments and $j$ is the number of
increments in the sample. From /77) /. as $j$ increases,

$$\hat s_m^{n,j}(x) := \sum_{i=1}^{d_m}\hat\beta_{i,m}^{n,j}\,\varphi_{i,m}(x), \qquad (6.4)$$
estimates the orthogonal projection of $\frac{n}{T}f_{T/n}(x)$ on $S_m$, where $f_t$ stands for
the probability density function of $X(t)$ (if it exists). On the other hand,
Theorem 5.3 shows that $\frac{n}{T}f_{T/n}(x)$ converges to the Levy density $p$, as $n\to\infty$
(under some regularity conditions). Therefore, for large $n$ and $j$, (6.4) will
approximate the projection of $p$ on $S_m$. Notice that in general, a.s.

$$\lim_{n\to\infty}\lim_{j\to\infty}\frac{n}{jT}\sum_{k=1}^{j}\varphi\left(X(t_k^n)-X(t_{k-1}^n)\right) = \int\varphi(x)\,\nu(dx),$$

whenever $\varphi$ is such that the limit (5.5) holds. Our penalized projection
estimators (6.1) are obtained from (6.4) by taking $n = j$. It is not clear from
the references just mentioned whether taking $n = j\to\infty$ will produce good
results or not. We will see below that this is the case.

Let $\mathcal{R}(X)$ be the linear space of measurable functions $h$ such that (5.5) is
satisfied. For instance, $\mathcal{R}(X)$ contains the functions $f$ satisfying conditions
(1) or (2) in Proposition 5.4. The following result holds true (see Section 1^31
for a proof).

Proposition 6.2 Let $s_m^\perp$ be the orthogonal projection of $s$ on $S_m$. If $\varphi_{i,m}$
and $\varphi_{i,m}^2$ belong to $\mathcal{R}(X)$ for every $m\in\mathcal{M}$ and $i = 1,\dots,d_m$, then the
approximated projection estimator $\hat s_m^n$ of $s$ on $S_m$ (based on $n$ equally spaced
observations) satisfies:

$$\lim_{n\to\infty}E\left[\|\hat s_m^n - s_m^\perp\|^2\right] = E\left[\|\hat s_m - s_m^\perp\|^2\right]. \qquad (6.5)$$

Moreover,

$$\lim_{n\to\infty}E\left[\|\hat s_m^n - s\|^2\right] = E\left[\|\hat s_m - s\|^2\right].$$

7 Numerical tests of projection estimators

In this section, we try to assess the performance of some penalized projection
estimators based on simulated Levy processes. Piecewise constant functions
are considered and, for their intrinsic relevance in mathematical finance, two
classes of Levy processes are studied: Gamma and variance Gamma
processes. A method of least-squares errors is also applied to generate
parametric Levy densities that closely fit the nonparametric outputs.
7.1 Specifications of the statistical methods

Let us describe in greater detail the considered projection estimators. To
simplify notation, $\mathcal{J}(B)$ is used instead of $\mathcal{J}([0,T]\times B)$ of (2.3) when
referring to the number of jumps of sizes in $B\in\mathcal{B}(\mathbb{R}_0)$ occurring prior to
$T$. Let $\mathcal{C} : a = x_0 < x_1 < \dots < x_m = b$ be a partition of the interval
$D = [a,b]$ ($0 < a$ or $b < 0$), and let $S_\mathcal{C}$ be the span of the indicator functions
$\chi_{[x_0,x_1)},\dots,\chi_{[x_{m-1},x_m)}$. In other words, the linear model $S_\mathcal{C}$ consists of
"histogram functions" on the window $D$ with cutoff points in $\mathcal{C}$. We assume that
the Levy process has a Levy density $s$ bounded outside of any neighborhood
of the origin. This assumption is very mild, and yet good enough for the
integral $\int_D s^2(x)\,dx$ to be finite. In that case, the orthogonal projection of
$s$ onto $S_\mathcal{C}$ exists (under the standard inner product of $L^2(D,dx)$), and thus
the projection estimation on $S_\mathcal{C}$ is meaningful. In the terminology of Section
2, the regularizing measure is simply $dx$, the regularized Levy density
coincides with the Levy density, and the orthonormal basis $\{\varphi_1,\dots,\varphi_m\}$ for $S_\mathcal{C}$
is

$$\varphi_i(x) = \frac{1}{\sqrt{x_i-x_{i-1}}}\,\chi_{[x_{i-1},x_i)}(x), \quad i = 1,\dots,m.$$

According to the basic estimation method outlined in Section 2, the
projection estimator on the linear model $S_\mathcal{C}$ is given by

$$\hat s_\mathcal{C}(x) = \frac{1}{T}\sum_{i=1}^{m}\frac{\mathcal{J}([x_{i-1},x_i))}{x_i-x_{i-1}}\,\chi_{[x_{i-1},x_i)}(x). \qquad (7.1)$$

Following the heuristics of Section 2 and Theorem 3.3 part (b), an appealing
procedure to select a projection estimator of the form (7.1) is to look for the
minimization of the following penalized contrast value

$$\frac{1}{T^2}\sum_{i=1}^{m}\frac{1}{x_i-x_{i-1}}\left\{c\,\mathcal{J}([x_{i-1},x_i)) - \left[\mathcal{J}([x_{i-1},x_i))\right]^2\right\}. \qquad (7.2)$$

Here, $c > 1$ is a constant that controls the level of penalization. In fact,
Theorem 3.3 and Corollary 3.8 ensure that, for large enough $T$, the previous
procedure will yield competitive results against the best projection estimator.
For that to happen it is necessary to restrict ourselves to models $\mathcal{C}$ satisfying
$D_\mathcal{C} < T$, where $D_\mathcal{C}$ is defined as in (3.4). In this case, the constant $D_\mathcal{C}$ is
$1/\min_{1\le i\le m}\{x_i - x_{i-1}\}$ as seen from Remark |
The simplest case is to take regular partitions $\{x_i = a + i\Delta x\}_{i=0}^m$, where
$\Delta x = (b-a)/m$ is the mesh of the partition. Then, the projection
estimator of (7.1) becomes

$$\hat s_m(x) = \frac{m}{T(b-a)}\sum_{i=1}^{m}\mathcal{J}([x_{i-1},x_i))\,\chi_{[x_{i-1},x_i)}(x), \qquad (7.3)$$

and penalized projection estimation will look to minimize

$$\frac{m}{T^2(b-a)}\sum_{i=1}^{m}\left\{c\,\mathcal{J}([x_{i-1},x_i)) - \left[\mathcal{J}([x_{i-1},x_i))\right]^2\right\}$$

over all $m$ such that $D_m = m/(b-a)$ is smaller than $T$.
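
For regular partitions the selection step reduces to a one-dimensional search over the number of cells $m$. The sketch below is illustrative only: the function names are hypothetical, the contrast is (7.2) specialized to cells of equal width, and `jumps` may be either observed jump sizes or, in the discrete-data setting of Section 6, the increments of the sampled path.

```python
def bin_counts(jumps, a, b, m):
    """Number of jumps falling in each of the m cells [x_{i-1}, x_i) of [a, b)."""
    counts = [0] * m
    width = (b - a) / m
    for x in jumps:
        if a <= x < b:
            counts[min(int((x - a) / width), m - 1)] += 1
    return counts


def histogram_contrast(jumps, T, a, b, m, c=2.0):
    """m / (T^2 (b - a)) * sum_i (c * J_i - J_i^2) for the regular partition with m cells."""
    counts = bin_counts(jumps, a, b, m)
    return m / (T ** 2 * (b - a)) * sum(c * j - j * j for j in counts)


def select_histogram(jumps, T, a, b, c=2.0):
    """Pick m minimizing the contrast among m with m / (b - a) < T; return (m, heights)."""
    candidates = [m for m in range(1, int(T * (b - a)) + 1) if m / (b - a) < T]
    if not candidates:
        raise ValueError("T * (b - a) is too small: no admissible partition")
    best = min(candidates, key=lambda m: histogram_contrast(jumps, T, a, b, m, c))
    heights = [best * j / (T * (b - a)) for j in bin_counts(jumps, a, b, best)]
    return best, heights


if __name__ == "__main__":
    m, heights = select_histogram([0.6, 0.7, 1.2, 1.8], T=2.0, a=0.5, b=2.0)
    print(m, heights)
```

The returned heights are the values of (7.3) on each cell of the selected partition.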

For comparisons against other procedures and to assess the goodness of fit
to specific parametric models, it is useful to determine the parametric model
of a given type that "best fits" our nonparametric estimators; for instance,
suppose that we want to assess whether or not the nonparametric results
support the parametric Gamma model for the Levy density. The method of
least-squares errors provides an easy solution to this problem. For instance,
if $s_\theta(x)$ is the parametric form of the Levy density, a plausible estimator of
$\theta$ is

$$\hat\theta = \mathrm{argmin}_\theta\, d(s_\theta,\hat s),$$

where $\hat s$ is the (penalized) projection estimator on a given family of linear
models, and $d(\cdot,\cdot)$ is a function that accounts for the difference between $s_\theta$ and
$\hat s$. For instance, for a fixed set of points $\{x_i\}_{i=1}^k\subset D$, $d(\cdot,\cdot)$ can simply be
defined for functions $f$ and $g$ as

$$d(f,g) = \sum_{i=1}^{k}\left[f(x_i)-g(x_i)\right]^2.$$

It is sometimes preferable to use a least-squares method that is linear in the
parameters and hence robust against numerical errors. In that case, we
can look for a functional $L$ so that $L(s_\theta)$ is linear in $\theta$ and define

$$d(f,g) = \sum_{i=1}^{k}\left[L(f)(x_i)-L(g)(x_i)\right]^2.$$
As an example, consider the Levy density of a Gamma Levy process with
parameters $\alpha$ and $\beta$:

$$s(x) = \frac{\alpha}{x}\,e^{-x/\beta}, \quad x > 0.$$

Given a projection estimator $\hat s$ of $s$, least-squares estimates of $\alpha$ and $\beta$ can be
constructed from

$$\mathrm{argmin}_{\alpha,\beta}\sum_{i=1}^{k}\left(\frac{\alpha}{x_i}\exp\left(-\frac{x_i}{\beta}\right) - \hat s(x_i)\right)^2, \qquad (7.5)$$

where $\{x_i\}_{i=1}^k\subset D$. Notice that the estimation would be very sensitive to
the points close to the origin. Instead, a regression method that is linear in
the parameters can be devised using a logarithmic transformation as follows:

$$\mathrm{argmin}_{\alpha,\beta}\sum_{i=1}^{k}\left(-\frac{1}{\beta}x_i + \log(\alpha) - \log(x_i\hat s(x_i))\right)^2. \qquad (7.6)$$
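
Since (7.6) is an ordinary linear regression of $y_i = \log(x_i\hat s(x_i))$ on $x_i$, with slope $-1/\beta$ and intercept $\log\alpha$, it has a closed-form solution. The following sketch is illustrative: the function name is hypothetical, and discarding points where the estimator vanishes (so that the logarithm is defined) is an assumption of this example, not a rule stated in the text.

```python
import math


def fit_gamma_levy_density(xs, s_hat_values):
    """Least squares of y_i = log(x_i * s_hat(x_i)) on x_i; returns (alpha, beta)."""
    # Points with s_hat(x_i) <= 0 (e.g. empty histogram cells) are skipped: assumption.
    pts = [(x, math.log(x * s)) for x, s in zip(xs, s_hat_values) if x > 0 and s > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                      # estimates -1 / beta
    intercept = mean_y - slope * mean_x    # estimates log(alpha)
    return math.exp(intercept), -1.0 / slope


if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5, 2.0, 2.5]
    exact = [2.0 * math.exp(-x / 0.5) / x for x in xs]   # alpha = 2, beta = 0.5
    print(fit_gamma_levy_density(xs, exact))
```

When the input values come from the exact Gamma Levy density, the fit recovers the parameters up to rounding, which is a convenient sanity check before applying it to a histogram estimator.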



7.2 Estimation of Gamma Levy densities. 
7.2.1 The model 

Levy Gamma processes are fundamental building blocks in the construction
of other Levy processes like the variance Gamma model |Tn| and the
generalized Gamma convolutions ^H]. Moreover, by Bernstein's theorem, any Levy
density of the form $u(x)/|x|$, where $u$ is a completely monotone function, is
the limit of superpositions of Gamma Levy densities.

The Gamma Levy process $X = \{X(t)\}_{t\ge0}$ is determined by two positive
parameters $\alpha$ and $\beta$ so that the probability density function of $X(t)$ is

$$f_t(x) = \frac{x^{\alpha t-1}e^{-x/\beta}}{\beta^{\alpha t}\,\Gamma(\alpha t)}, \qquad (7.7)$$

for $x > 0$. The characteristic function of $X(t)$ is

$$E\left[e^{iuX(t)}\right] = (1-i\beta u)^{-\alpha t} = \exp\left\{t\int_0^\infty\left(e^{iux}-1\right)\nu(dx)\right\},$$

where the Levy measure $\nu$ is

$$\nu(dx) = \frac{\alpha}{x}\exp\left(-\frac{x}{\beta}\right)dx, \quad\text{for } x > 0; \qquad (7.8)$$

see [20], p. 87, or Example 8.10 of [...]. From the point of view of the
marginal densities, $\beta$ is a scale parameter and $\alpha$ is a shape parameter. In
terms of the jump activity, $\alpha$ controls the overall activity of the jumps, while
$\beta$ takes charge of the heaviness of the Levy density tail, and hence, of the
frequency of big jumps. Notice that a change in the time units is statistically
equivalent to a change of the parameter $\alpha$, while a change in the units in
which the values of $X$ are measured is statistically reflected in a change
of the parameter $\beta$. That is to say, the scaled process $\{cX(ht)\}_{t\ge0}$ is also
a Gamma Levy process with shape parameter $\alpha h$ and scale parameter $\beta c$.
This property is consistent with the previous remark on $\alpha$ taking charge of
the jump activity and on $\beta$ taking charge of the frequency of large jumps.

7.2.2 The simulation procedure 

Simulation schemes based on series representations are used to generate Gamma
Levy processes. Such procedures allow one to retrieve a sample of the jumps of
the process. Concretely, following |3T|, the process

$$X(t) = \beta\sum_{i=1}^{\infty}V_i\exp\left(-\frac{\Gamma_i}{\alpha T}\right)\mathbf{1}[U_i\le t], \qquad (7.9)$$

is a Gamma Levy process on $[0,T]$ with shape parameter $\alpha$ and scale
parameter $\beta$ provided that $\{\Gamma_i\}_{i\ge1}$ is a homogeneous Poisson process with intensity
1, $\{V_i\}_{i\ge1}$ are independent exponential r.v. with mean 1, $\{U_i\}_{i\ge1}$ are i.i.d.
uniformly distributed on $[0,T]$, and these three series are mutually
independent. Below, we shall truncate the series to $n$ terms in order to generate a
sample path, and in particular, to approximate the jump process of $X$ by

$$\mathcal{J}_n(\cdot) = \sum_{i=1}^{n}\delta_{(U_i,J_i)}(\cdot), \qquad (7.10)$$

where $J_i = \beta V_i\exp\left(-\frac{\Gamma_i}{\alpha T}\right)$.
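
A truncated version of the series (7.9) can be simulated with the standard library alone. The sketch below is a minimal illustration under stated assumptions: the exponent is taken to be $-\Gamma_i/(\alpha T)$ (the extraction of the formula is damaged, and this form is the one for which the total of the jumps has mean $\alpha\beta T$), and the function names are hypothetical.

```python
import math
import random


def simulate_gamma_jumps(alpha, beta, T, n, rng=random):
    """First n terms of the series (7.9): list of (jump time U_i, jump size J_i)."""
    arrival = 0.0
    jumps = []
    for _ in range(n):
        arrival += rng.expovariate(1.0)   # Gamma_i: arrival times of a rate-1 Poisson process
        v = rng.expovariate(1.0)          # V_i: exponential with mean 1
        u = rng.uniform(0.0, T)           # U_i: uniform jump time on [0, T]
        jumps.append((u, beta * v * math.exp(-arrival / (alpha * T))))
    return jumps


def path_value(jumps, t):
    """X(t) for the truncated series: sum of the jumps with U_i <= t."""
    return sum(size for time, size in jumps if time <= t)


if __name__ == "__main__":
    rng = random.Random(0)
    jumps = simulate_gamma_jumps(alpha=1.0, beta=1.0, T=365.0, n=2000, rng=rng)
    print(len(jumps), path_value(jumps, 365.0))
```

Because the arrival times $\Gamma_i$ increase, the jump sizes decay roughly geometrically in $i$, so truncating the series discards only the smallest jumps.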

7.2.3 The numerical results 

We now present a few examples to illustrate the technique of projection
estimation on histogram functions based on regular partitions (see Section 7.1
for the specifications of the estimation method). Figure ^ shows the Gamma
Levy density with $\alpha = 1$ and $\beta = 1$, and the penalized projection histogram
of the form (7.3). The estimation is based on 2000 jumps of the Gamma
Levy process on [0,365], and on the resulting Poisson integrals obtained by
using (7.10) instead of $\mathcal{J}$. The least-squares method (7.6), taking the $x_i$'s as
the midpoints of the partition intervals, yields the estimators $\hat\alpha = 0.932$ and
$\hat\beta = 1.055$. The maximum likelihood estimators based on the increments of
the sample path of time length 1 are 1.015 for $\alpha$ and 0.949 for $\beta$ (we do not
observe real improvement if the time length of the increments is reduced).

In the next simulation, we consider a Gamma density with a lighter tail ($\beta =
0.5$) and more jump activity ($\alpha = 2$). The opposite setting was also studied:
a heavier tail determined by $\beta = 2$ and a lower jump activity given by
$\alpha = 0.5$ (see Figures 2 and EI). In the first scenario, the least-squares
estimators are $\hat\alpha = 1.907$ and $\hat\beta = 0.472$, while the maximum likelihood
estimators are 1.924 and 0.527, respectively. For the second Gamma density,
the least-squares method (7.5), taking the midpoints of the partition
intervals, produces the estimators $\hat\alpha = 0.5$ and $\hat\beta = 1.72$, while the maximum
likelihood estimators are 0.55 and 1.99, respectively.

Approximate histogram estimation on regular partitions is less successful in
the case of high activity levels. This problem is particularly evident when we
have, in addition, heavy tails in the Levy density. For instance, if $\alpha = 3$ and $\beta = 3$,
the method requires a large sample size to satisfactorily retrieve the behavior
around the origin (see Figures El and Ej). For 2000 jumps, the least-squares
estimates are $\hat\alpha = 1.87$ and $\hat\beta = 4.45$, while the estimates are $\hat\alpha = 2.8893$ and
$\hat\beta = 2.9268$ for twice as many jumps. The maximum likelihood estimators
based on the increments of time length 0.5 are 2.4134054 for $\alpha$ and 3.30971 for
$\beta$ when the approximating process is made out of 2000 jumps, while when
the process is approximated using 4000 jumps, these estimates are 2.8281
and 3.1007 for $\alpha$ and $\beta$, respectively. We also notice in our experiments that
the estimates for the first simulation improve considerably if the window of
estimation is taken "far away" from the origin (for example, $\hat\alpha = 3.20944$
and $\hat\beta = 2.68775$ on $[a,b] = [1.5,5]$; see Figure IHl).

7.2.4 Regularized projection estimation around the origin 

We present another way to estimate the Gamma Levy density even around
the origin based on the regularization technique described in Section 2. The
key observation is the following: the Gamma Levy measure (7.8) can be
written as

$$\nu(dx) = \alpha x\exp\left(-\frac{x}{\beta}\right)\eta(dx), \qquad (7.11)$$

where $\eta(dx) = x^{-2}dx$. Then, $s(x) = \alpha x\exp(-x/\beta)$ is square integrable with
respect to $\eta$, opening the possibility to use the projection estimation of $s$
on a linear space $S$ of $L^2((0,\infty),\eta)$. Once an estimator $\hat s$ for $s$ has been
obtained, $\hat p$ defined by $\hat p(x) = \hat s(x)/x^2$ can work as an estimator for the Levy
density $p(x) = \alpha\exp(-x/\beta)/x$. In the terminology introduced in Section
2, $\eta$ is a regularizing measure for the Gamma Levy density $p$, and $s$ is the
corresponding regularized Levy density (see Definition 2.1).

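Let us specify this method for the linear model

$$S_\mathcal{C} = \left\{f(x) = c_1x\,\chi_{[x_0,x_1)}(x) + \sum_{i=2}^{m}c_i\,\chi_{[x_{i-1},x_i)}(x) : c_1,\dots,c_m\in\mathbb{R}\right\},$$

where $\mathcal{C} : 0 = x_0 < x_1 < \dots < x_m = b$ is a partition of a chosen interval
$D = [0,b]$. The projection estimator, say $\hat s_\mathcal{C}$, onto $S_\mathcal{C}$, under the standard
inner product of $L^2((0,\infty),\eta)$, takes on the value

$$\hat s_\mathcal{C}(x) = \frac{x}{Tx_1}\sum_{t\le T}\Delta X(t)\,\mathbf{1}[\Delta X(t) < x_1],$$

if $x < x_1$, while if $x_{i-1}\le x < x_i$, for some $i\in\{2,\dots,m\}$, then

$$\hat s_\mathcal{C}(x) = \frac{x_ix_{i-1}}{T(x_i-x_{i-1})}\,\mathcal{J}([x_{i-1},x_i)).$$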

We shall use the penalty function of Theorem 3.3 part (b) to perform the
model selection procedure. That is, among different partitions $\mathcal{C}$ that satisfy

$$D_\mathcal{C} = \max\left\{\frac{1}{x_1},\frac{x_1x_2}{x_2-x_1},\dots,\frac{x_mx_{m-1}}{x_m-x_{m-1}}\right\} < T,$$

we choose the projection estimator $\hat s_\mathcal{C}$ that minimizes

$$\frac{1}{T^2}\sum_{i=2}^{m}\frac{x_ix_{i-1}}{x_i-x_{i-1}}\left\{c\,\mathcal{J}([x_{i-1},x_i)) - \left[\mathcal{J}([x_{i-1},x_i))\right]^2\right\} + \frac{1}{T^2x_1}\left\{c\sum_{t\le T:\,\Delta X(t)<x_1}(\Delta X(t))^2 - \Big(\sum_{t\le T:\,\Delta X(t)<x_1}\Delta X(t)\Big)^2\right\}.$$

The previous formulas are found directly from the definitions and results
given in Section 2 (see for instance formulas (2.9), (2.10), (3.4), and (3.5)).

Remark 7.1 Observe that the previous procedure is appropriate to estimate
the density function $s(x) = \frac{\alpha}{x}\exp\left(-\frac{x}{\beta}\right)$ around the origin as far as

$$\hat\alpha := \frac{1}{Tx_1}\sum_{t\le T}\Delta X(t)\,\mathbf{1}[\Delta X(t) < x_1],$$

is a good estimator of $\alpha$. It is not hard to check that the bias of $\hat\alpha$ tends to
zero as $x_1\searrow0$. However, the variance of $\hat\alpha$ converges to $\alpha/(2T)$, suggesting that
the method works better when $T$ is "large" and $\alpha$ is "small".
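
The regularized estimator can be computed directly from a sample of jump sizes. The sketch below is illustrative and follows the two expressions for $\hat s_\mathcal{C}$ given above (linear on the first cell, constant on the others); the function names are hypothetical, and the model-selection step over partitions is omitted.

```python
def regularized_estimator(jumps, T, cuts):
    """Return (s_hat, alpha_hat) for the partition 0 = cuts[0] < ... < cuts[-1]
    under the regularizing measure eta(dx) = dx / x^2."""
    x1 = cuts[1]
    # Slope on the first cell: (1 / (T * x1)) * sum of the jumps smaller than x1.
    alpha_hat = sum(j for j in jumps if 0 < j < x1) / (T * x1)
    cells = list(zip(cuts[1:], cuts[2:]))
    # Constant level on [lo, hi): lo * hi * (number of jumps in the cell) / (T * (hi - lo)).
    levels = [lo * hi * sum(1 for j in jumps if lo <= j < hi) / (T * (hi - lo))
              for lo, hi in cells]

    def s_hat(x):
        if 0 <= x < x1:
            return alpha_hat * x
        for (lo, hi), level in zip(cells, levels):
            if lo <= x < hi:
                return level
        return 0.0

    return s_hat, alpha_hat


def levy_density_estimate(s_hat, x):
    """p_hat(x) = s_hat(x) / x^2, the estimator of the Levy density itself."""
    return s_hat(x) / x ** 2


if __name__ == "__main__":
    s_hat, alpha_hat = regularized_estimator([0.1, 0.2, 0.6, 0.7, 1.5], T=2.0,
                                             cuts=[0.0, 0.5, 1.0, 2.0])
    print(alpha_hat, levy_density_estimate(s_hat, 0.25))
```

On the first cell the resulting density estimate equals $\hat\alpha/x$, which is how the method captures the behavior of the Gamma Levy density near the origin.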

We apply the above method to the simulated Levy process used in Figure ^,
i.e. a Gamma process with $\alpha = 1$ and $\beta = 1$. Figure U\ shows the
estimator $\hat p(x) = \hat s(x)/x^2$ and the actual Levy density $p(x) = \exp(-x)/x$
for $x\in[0.02,1]$ (we used regular partitions on [0,1]). From Figure [TJ the
improvement is notable and, moreover, we accomplish a good estimation
around the origin, namely $\hat p(x) = 0.9/x$ for $x\in[0,0.2)$.

This regularization procedure was also applied to the simulations of the
Gamma Levy processes with ($\alpha = 3$, $\beta = 3$) and with ($\alpha = 1/2$, $\beta = 2$).
The results are plotted in Figures |H1 and IHl below (compare with Figures 2
and El). We observe an improvement on both sample data. For instance, for
$\alpha = \beta = 3$, the nonparametric estimator $\hat s(x)/x^2$ combined with the method of
least-squares errors estimates $\alpha$ by 2.7296 and $\beta$ by 3.2439. Similarly, when
$\alpha = 0.5$ and $\beta = 2$, the least-squares estimates are $\hat\alpha = 0.4825$ and $\hat\beta = 2.1131$.

7.2.5 Performance of projection estimation based on finitely many
observations

In this part, we study the performance of the (approximate) projection
estimators introduced in Section 5 and formally stated in Section 6. Namely,
the method obtained by approximating the Poisson process of jumps by

$$\mathcal{J}^n(\cdot) = \sum_{i=1}^{n}\delta_{(t_i^n,J_i)}(\cdot),$$

where $J_i$ is the $i$-th increment of $X$ from $t_{i-1}^n$ to $t_i^n$ and $t_i^n = iT/n$. The time
span between increments is denoted by $\Delta t = T/n$. Again, the considered
estimators are histograms as defined in Section 7.1 and applied in Section
7.2.3.

Table 1 compares the (approximate) penalized projection estimators with
least-squares errors (PPE-LSE) to the maximum likelihood estimators (MLE)
for the Gamma Levy process with $\alpha = \beta = 1$ using different time spans $\Delta t$.
We also consider two types of simulations: jump-based and increment-based.
The method based on jumps uses the series representation with $n = 36500$ jumps
occurring during the time period [0,365] (notice that if we think of 365 as
days, the number of jumps corresponds to a rate of about 1 jump every
15 minutes). The increment-based method is a discrete skeleton with mesh
of 0.001. Notice that maximum likelihood estimation does not do well for
small time spans when the approximate sample path is based on jumps.
On the other hand, penalized projection estimation does not provide good
results for long time spans when the approximate sample path is based on
increments. The sampling distributions of the MLE for $\alpha$ and $\beta$ are shown in
Figures IT^ and IT^ in the case of $\Delta t = 0.1$. On the other hand, the sampling
distributions of the estimates for $\alpha$ and $\beta$ obtained from fitting the PPE are
given in Figures IT^ and IT31. Even though the MLE are far superior,
the PPE-based estimates perform well considering that they are
model-free.





           Jump-based Simulation          Increment-based Simulation
           PPE-LSE         MLE            PPE-LSE         MLE
  Δt       α      β        α      β       α      β        α       β
  1        1.01   1.46     .997   .995    .73    1.78     1.09    .99
  0.5      1.03   1.09     .972   .978    .9     1.49     1.01    1.06
  0.1      .944   .995     1.179  .837    .923   1.03     .989    1.09
  0.01     .969   .924     6.92   .5      .955   1.019    .9974   1.083



Table 1: Estimation of a Levy Gamma process with a = (3 = 1. Two types of 
simulation are considered: series-representation based and increments-based. 
The estimations are based on equally spaced sampling observation at the time 
span At. Results for the approximate penalized projection estimators with 
least-squares errors, and for the maximum likelihood estimators are given. 
7.3 Estimation of variance Gamma processes.

7.3.1 The model

Variance Gamma processes were proposed in ^H] as substitutes for
Brownian motion in the Black-Scholes model. There are two useful representations
for this type of process. In short, a variance Gamma process $X = \{X(t)\}_{t\ge0}$
is a Brownian motion with drift, time changed by a Gamma Levy process.
Concretely,

$$X(t) = \theta U(t) + \sigma W(U(t)), \qquad (7.12)$$

where $\{W(t)\}_{t\ge0}$ is a standard Brownian motion, $\theta\in\mathbb{R}$, $\sigma > 0$, and $U =
\{U(t)\}_{t\ge0}$ is an independent Gamma Levy process with density at time $t$
given by

$$f_t(x) = \frac{x^{t/\nu-1}\exp(-x/\nu)}{\nu^{t/\nu}\,\Gamma(t/\nu)}, \quad x > 0. \qquad (7.13)$$

Notice that $E[U(t)] = t$ and $\mathrm{Var}[U(t)] = \nu t$; therefore, the random clock
$U$ has a "mean rate" of one and a "variance rate" of $\nu$. There is no loss of
generality in restricting the mean rate of the Gamma process $U$ to one since,
as a matter of fact, any process of the form

$$\theta_1V(t) + \sigma_1W(V(t)),$$

where $V(t)$ is an arbitrary Gamma Levy process, $\theta_1\in\mathbb{R}$, and $\sigma_1 > 0$, has
the same law as a process of the form (7.12) with suitably chosen $\theta$, $\sigma$, and
$\nu$. This is a consequence of the self-similarity^2 property of Brownian motion
and the fact that $\nu$ in (7.13) is a scale parameter.

The process $X$ is itself a Levy process since Gamma processes are
subordinators (see Theorem 30.1 of [33]). Moreover, it is not hard to check that
"statistically" $X$ is the difference of two Gamma Levy processes (see 2.1 of
m):

$$\{X(t)\}_{t\ge0} \stackrel{\mathcal{D}}{=} \{X_+(t)-X_-(t)\}_{t\ge0}, \qquad (7.14)$$

where $\{X_+(t)\}_{t\ge0}$ and $\{X_-(t)\}_{t\ge0}$ are Gamma Levy processes with respective
Levy measures

$$\nu_\pm(dx) = \alpha\exp\left(-\frac{x}{\beta_\pm}\right)\frac{dx}{x}, \quad\text{for } x > 0.$$

^2 namely, $\{W(ct)\}_{t\ge0} \stackrel{\mathcal{D}}{=} \{c^{1/2}W(t)\}_{t\ge0}$, for any $c > 0$.
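Here, $\alpha = 1/\nu$ and

$$\beta_\pm = \sqrt{\frac{\theta^2\nu^2}{4} + \frac{\sigma^2\nu}{2}} \pm \frac{\theta\nu}{2}.$$

As a consequence of this decomposition, the Levy density of $X$ takes the
form

$$s(x) = \begin{cases}\dfrac{\alpha}{|x|}\exp\left(-\dfrac{|x|}{\beta_-}\right), & x < 0,\\[2ex] \dfrac{\alpha}{x}\exp\left(-\dfrac{x}{\beta_+}\right), & x > 0,\end{cases}$$

where $\alpha > 0$, $\beta_-\ge0$, and $\beta_+\ge0$ (of course, $\beta_- + \beta_+ > 0$). As in the case of
Gamma Levy processes, $\alpha$ controls the overall jump activity, while $\beta_+$ and
$\beta_-$ take charge, respectively, of the intensity of large positive and negative
jumps. In particular, the difference between $1/\beta_+$ and $1/\beta_-$ determines the
frequency of drops relative to rises, while their sum measures the frequency
of large moves relative to small ones.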
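
The change of parametrization from $(\theta,\sigma^2,\nu)$ to $(\alpha,\beta_+,\beta_-)$ is a one-line computation. The sketch below is illustrative (the function names are hypothetical) and uses the relations $\alpha = 1/\nu$ and $\beta_\pm = \sqrt{\theta^2\nu^2/4 + \sigma^2\nu/2}\pm\theta\nu/2$; the parameter values in the usage example are the ones quoted in Section 7.3.3.

```python
import math


def vg_to_gamma_parameters(theta, sigma2, nu):
    """Map (theta, sigma^2, nu) to (alpha, beta_plus, beta_minus)."""
    root = math.sqrt(theta ** 2 * nu ** 2 / 4.0 + sigma2 * nu / 2.0)
    return 1.0 / nu, root + theta * nu / 2.0, root - theta * nu / 2.0


def vg_levy_density(x, alpha, beta_plus, beta_minus):
    """Levy density of X at x != 0: alpha * exp(-|x| / beta) / |x| on each side of 0."""
    beta = beta_plus if x > 0 else beta_minus
    return alpha * math.exp(-abs(x) / beta) / abs(x)


if __name__ == "__main__":
    alpha, beta_plus, beta_minus = vg_to_gamma_parameters(-0.00056256, 0.01373584, 0.002)
    print(alpha, beta_plus, beta_minus)
```

Two identities follow from the formula and are handy as checks: $\beta_+ - \beta_- = \theta\nu$ and $\beta_+\beta_- = \sigma^2\nu/2$.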

7.3.2 The simulation procedure 

The above two representations provide straightforward methods to simulate
a variance Gamma model. One way is to simulate the Gamma Levy
processes $\{X_+(t)\}_{0\le t\le T}$ and $\{X_-(t)\}_{0\le t\le T}$ of (7.14) using the series
representation method of Section 7.2.2. The other approach is to first generate
the random time change $\{U(t)\}_{0\le t\le T}$ of (7.12), and then construct a discrete
skeleton from the increments $X(i\Delta t) - X((i-1)\Delta t)$, $i\ge1$. The increments
of $X$ are simply simulated using normal random variables with means and
variances determined by the increments of $U$.
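
The second approach can be sketched with the standard library. This is a minimal illustration, not the code used for the experiments: the function name is hypothetical, each increment of the clock $U$ over a span $\Delta t$ is drawn as a Gamma variable with shape $\Delta t/\nu$ and scale $\nu$, and, given that increment $\Delta U$, the increment of $X$ is drawn as a normal variable with mean $\theta\,\Delta U$ and variance $\sigma^2\Delta U$.

```python
import math
import random


def simulate_vg_increments(theta, sigma, nu, dt, n, rng=random):
    """n increments X(i*dt) - X((i-1)*dt) of X(t) = theta*U(t) + sigma*W(U(t))."""
    increments = []
    for _ in range(n):
        du = rng.gammavariate(dt / nu, nu)    # increment of the Gamma clock U (mean dt)
        increments.append(rng.gauss(theta * du, sigma * math.sqrt(du)))
    return increments


if __name__ == "__main__":
    rng = random.Random(1)
    incs = simulate_vg_increments(theta=0.5, sigma=1.0, nu=0.5, dt=0.1, n=5000, rng=rng)
    print(sum(incs) / (len(incs) * 0.1))   # empirical drift per unit time, close to theta
```

The partial sums of the returned list give the discrete skeleton $X(i\Delta t)$, which is the input expected by the approximated projection estimators of Section 6.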

7.3.3 The numerical results 

Notice that, from an algorithmic point of view, the estimation for the variance
Gamma model using penalized projection is not different from the estimation
for the Gamma process. We can simply estimate both tails of the variance
Gamma process separately. However, from the point of view of maximum
likelihood estimation (MLE), the problem is numerically challenging. Even
though the marginal density functions have closed form expressions (see jTHI),
there are well-documented issues with MLE (see for instance 1211). The
likelihood function is highly flat for a wide range of parameters, and good starting
values as well as convergence are critical. Also, the separation of parameters
and the identification of the variance Gamma process from other classes of
the generalized hyperbolic Levy processes are difficult. In fact, the difference
between subclasses in terms of likelihood is small. It is important to mention
that these issues worsen when dealing with "high-frequency" data.

Let us consider a numerical example motivated by the empirical findings of
[TB] based on daily returns on the S&P stock index from January 1992 to
September 1994 (see their Table I). Using maximum likelihood methods, the
annualized estimates of the parameters for the variance Gamma model were
reported to be $\hat\theta = -0.00056256$, $\hat\sigma^2 = 0.01373584$, and $\hat\nu = 0.002$, from
where we obtain $\alpha = 500$, $\beta_+ = 0.0037056$, and $\beta_- = 0.0037067$. Figures CUl
and ^2 show respectively the left and right tails of the Levy density and
their penalized projection estimators, as well as their corresponding best-fit
variance Gamma Levy densities using a least-squares method, and their
marginal probability density functions (pdf) scaled by $1/\Delta t$ (the reciprocal
of the time span between observations). The estimation was based on 5000
simulated increments with $\Delta t$ equal to one-eighth of a day. The figures seem
quite comforting. To get a better picture, Figures ITHl and IT7I show the
sampling distributions of the estimates for $\alpha$ and $\beta_+$ obtained from applying the
least-squares method to the penalized projection estimators. The histograms
are based on 1000 samples of size 5000 with $\Delta t = 1/8$ of a day. This
experiment shows clear, though not critical, underestimation of the parameter
$\alpha$ and overestimation of the parameters $\beta_\pm$. A simple method of moments
(based on the first four moments) yields better results (see Figures UHl and
IT^). Nonparametric methods are not free lunches, and usually the gain in
robustness is paid for by a loss in precision.



8 Concluding Remarks 

• In the present paper we have developed a new methodology for the
estimation of the Levy density of a Levy process. Our methods are quite
flexible in the sense that different types of estimating functions can be
used; for instance, histograms, splines, trigonometric polynomials, and
wavelets. The estimation is model free, easily implementable, and
suitable for "high-frequency" data. We prove that, based on continuous-time
data, our procedures enjoy good asymptotic properties. Oracle
inequalities imply that, up to a constant, the procedure will achieve
the best possible risk among the projection estimators. Moreover, it
is proved that penalized projection estimators on splines achieve the
optimal rate of convergence, from the minimax point of view, on some
classes of smooth Levy densities. Simulations show good results in
Levy models with infinite jump activity such as the variance Gamma
model.

• Generalization of our procedures and results to some multivariate Levy
models can be readily obtained, since the results behind our
construction have multivariate versions. Indeed, the Levy-Ito decomposition of
the sample paths, the concentration inequalities for compensated
Poisson integrals, the inference theory for Locally Asymptotically Normal
distributions, and the short-term properties of the marginal
distributions are valid in the multivariate setting. More precisely, consider a
Levy process $X = \{X(t)\}_{t\ge0}$ on $\mathbb{R}^d$ with Levy measure $\nu$. Assume that,
on a window of estimation $D\in\mathcal{B}(\mathbb{R}^d\setminus\{0\})$, $\nu$ is absolutely continuous
with respect to a reference measure $\eta$ and that $s = d\nu/d\eta$ is bounded
with also $\int_D s^2(x)\,\eta(dx) < \infty$. Then, given a finite-dimensional
subspace $S$ of $L^2((D,\eta))$, the projection estimator of $s$ on $S$ is defined as
in Section 2 with $\mathcal{J}$ being the Poisson measure on $\mathbb{R}_+\times\mathbb{R}^d$ associated
with the jumps of $X$. Similarly, penalized projection estimators can be
constructed, and the risk bound of Theorem 3.3, along with the Oracle
inequality (3.8), are satisfied. The results of Sections El and IHl are
valid as well. However, let us point out that further specifications of
our methods for some semiparametric models are desirable. Important
examples of these models include multivariate stable processes and the
tempered stable Levy processes recently introduced in [32j.

• We have concentrated here on the estimation of the jump part of the 
Levy process. It is natural to address the problem of estimating the 
continuous part too. In the one-dimensional case, this part is of the
form $bt + \sigma W(t)$, where $\{W(t)\}_{t \ge 0}$ is a standard Brownian motion. In
the multivariate case, it is characterized by a vector $b$ and a symmetric
nonnegative-definite matrix $\Sigma$. There are several approaches to deal
with the estimation of $\Sigma$, from moment-based methods to methods
based on high-frequency data. A simple approach is to use the following
functional limit:



where $\{Y(t)\}_{t \ge 0}$ is a centered Gaussian Levy process with
variance-covariance matrix $\Sigma$. This result can be deduced from the proof of the
uniqueness of the Levy-Khintchine representation as in pp. 40 of 
Another simple method is to consider empirical versions of the
moments:

$$E\big[(X_i(t) - E X_i(t))(X_k(t) - E X_k(t))\big] = t\Big(\Sigma_{i,k} + \int x_i x_k\,\nu(dx)\Big),$$

provided that $\int_{\|x\|>1} \|x\|^2\,\nu(dx) < \infty$ (see Section 25 [Sl]). Here, $X_i(t)$
and $x_i$ refer to the $i$-th component of the vectors $X(t)$ and $x$, respectively,
while $\Sigma_{i,k}$ is the $(i,k)$ entry of $\Sigma$. The second term on the right-hand
side of the above expression can be estimated using our estimators for
$\nu$. In the one-dimensional case, another approach is to use "threshold
estimators" of the form:



$$\sum_{k=1}^{n} (\Delta_k X)^2\,\mathbf{1}\big\{(\Delta_k X)^2 \le r(h)\big\},$$

where $\Delta_k X = X(t_k) - X(t_{k-1})$ is the $k$-th increment of the process
and $r(h)$ is an appropriate cutoff function (see [2Z] for details). For
a class of semimartingales with finite jump activity, [H] provides other
methodology based on the bipower variation (see also [T]). In the case
of Levy processes with finite jump activity, P disentangles the diffusion
from jumps using maximum likelihood and the Generalized Method of
Moments. On the other hand, the estimation of the parameter $b$ can
be done by different methods. For instance, using the empirical version
for

$$E[X(t)] = t\Big(b + \int_{\|x\|>1} x\,\nu(dx)\Big),$$

valid if $\int_{\|x\|>1} \|x\|\,\nu(dx) < \infty$. Another approach is to estimate the
"drift" $b_0 = b - \int_{\|x\|\le 1} x\,\nu(dx)$ (where the integration is component-wise)
using the fact that

$$P\Big[\lim_{h \to 0} \frac{1}{h}X(h) = b_0\Big] = 1.$$

The above result holds true if $\int_{\|x\|\le 1} \|x\|\,\nu(dx) < \infty$ and $\Sigma = 0$ (see
^^). Even though our methods are valid for Levy processes that are not
necessarily pure-jump, it is expected that the presence of a diffusion
component will reduce the efficiency (in terms of speed of convergence
and accuracy) of our estimators. It would be interesting to study this
phenomenon in greater detail.
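
The threshold idea mentioned above can be illustrated with a minimal Python sketch. It is not the procedure of the cited reference: it assumes a one-dimensional process made of a Brownian part plus a fixed number of large jumps, uses the hypothetical cutoff r(h) = h^0.9, and divides the thresholded sum by T so that the output is an estimate of the diffusion variance. All function names and numerical values are choices made here for illustration.

```python
import math
import random

def simulate_increments(n, T, sigma, n_jumps, rng):
    """Increments of sigma*W(t) plus n_jumps large jumps on a regular grid of [0, T]."""
    h = T / n
    incs = [sigma * math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n)]
    # Place each jump in a distinct increment; sizes are bounded away from zero.
    for k in rng.sample(range(n), n_jumps):
        incs[k] += rng.choice([-1.0, 1.0]) * rng.uniform(0.5, 1.5)
    return incs, h

def threshold_variance(incs, h, T, power=0.9):
    """Sum of squared increments not exceeding the cutoff r(h) = h**power, divided by T."""
    r = h ** power  # assumed cutoff, chosen only for this illustration
    return sum(d * d for d in incs if d * d <= r) / T

def realized_variance(incs, T):
    """Plain sum of squared increments divided by T (no thresholding)."""
    return sum(d * d for d in incs) / T

rng = random.Random(0)
T, n, sigma = 1.0, 10_000, 0.5
incs, h = simulate_increments(n, T, sigma, n_jumps=5, rng=rng)
sigma2_hat = threshold_variance(incs, h, T)
sigma2_naive = realized_variance(incs, T)
print(sigma2_hat, sigma2_naive)
```

In this setup the thresholded sum discards the increments that contain a jump, so it stays close to sigma squared, while the plain sum of squares also picks up the squared jump sizes.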



9 Main Proofs 

9.1 Proof of the risk Bound 

We will break the proof of Theorem 3.3 into several preliminary results.

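Lemma 9.1 For any penalty function $\mathrm{pen} : \mathcal{M} \to [0, \infty)$ and any $m \in \mathcal{M}$,
the penalized projection estimator $\tilde{s}$ satisfies

$$\|s - \tilde{s}\|^2 \le \|s - s_m^\perp\|^2 + 2\chi_{\hat m}^2 + 2\nu_D(s_{\hat m}^\perp - s_m^\perp) + \mathrm{pen}(m) - \mathrm{pen}(\hat m), \quad (9.1)$$

where $\chi_m^2 = \|s_m^\perp - \hat{s}_m\|^2$ and where the functional $\nu_D : L^2((D,\eta)) \to \mathbb{R}$ is
defined by

$$\nu_D(f) = \iint_{[0,T]\times D} f(x)\,\frac{\mathcal{J}(dt,dx) - s(x)\,dt\,\eta(dx)}{T}. \quad (9.2)$$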

The general idea to deduce (3.5) is to bound the unattainable terms of the
right hand side of (9.1) (namely $\chi_{\hat m}^2$ and $\nu_D(s_{\hat m}^\perp - s_m^\perp)$) by observable
statistics. Then, the form of $\mathrm{pen}(\cdot)$ will be determined by these observable statistics
so that the right hand side in (9.1) does not involve $\hat m$. To carry out this
plan, we use concentration inequalities for $\chi_m^2$ and for the compensated
Poisson integrals $\nu_D(f)$. The following result gives a concentration inequality for
general compensated Poisson integrals.

Proposition 9.2 Let $N$ be a Poisson process on a measurable space $(V, \mathcal{V})$
with mean measure $\mu$ and let $f : V \to \mathbb{R}$ be an essentially bounded measurable
function satisfying $0 < \int_V f^2(v)\mu(dv)$ and $\int_V |f(v)|\mu(dv) < \infty$. Then, for
any $u > 0$,

$$P\Big[\int_V f(v)\big(N(dv) - \mu(dv)\big) \ge \|f\|_{L^2(\mu)}\sqrt{2u} + \frac{1}{3}\|f\|_\infty u\Big] \le e^{-u}, \quad (9.3)$$

where $\|f\|^2_{L^2(\mu)} = \int_V f^2(v)\mu(dv)$. In particular, if $f : V \to [0, \infty)$ then, for
any $\varepsilon > 0$ and $u > 0$,

$$P\Big[(1+\varepsilon)\Big(\int_V f(v)N(dv) + \Big(\frac{1}{2\varepsilon} + \frac{5}{6}\Big)\|f\|_\infty u\Big) \ge \int_V f(v)\mu(dv)\Big] \ge 1 - e^{-u}. \quad (9.4)$$

For a proof of the inequality (9.3), see (Proposition 7) or [21] (Corollary
5.1). Inequality (9.4) is a direct consequence of (9.3) (see Section 9.3 for a
proof).
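
As a sanity check of the shape of (9.3), here is a small Python sketch for the simplest possible integrand. It assumes f is identically 1 on V and that the total mass of the mean measure is a number `lam`, so the compensated integral is N(V) - lam with N(V) Poisson distributed, the L2 norm of f is sqrt(lam), and its sup norm is 1. The helper names are hypothetical; the tail probability is computed exactly from the Poisson mass function rather than simulated.

```python
import math

def poisson_tail(lam, k_min):
    """P[N >= k_min] for N ~ Poisson(lam), via the complementary sum of the pmf."""
    if k_min <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_min))
    return max(0.0, 1.0 - cdf)

def bernstein_threshold(lam, u):
    """Deviation level in (9.3) when f = 1: sqrt(2*u*lam) + u/3."""
    return math.sqrt(2.0 * u * lam) + u / 3.0

lam = 5.0  # assumed total mass of the mean measure
rows = []
for u in (0.5, 1.0, 2.0, 4.0):
    t = bernstein_threshold(lam, u)
    k_min = math.ceil(lam + t)  # N - lam >= t  <=>  N >= lam + t, and N is an integer
    rows.append((u, poisson_tail(lam, k_min), math.exp(-u)))

for u, tail, bound in rows:
    print(f"u={u}: exact tail {tail:.4f}, bound e^-u {bound:.4f}")
```

For each listed value of u the exact tail probability sits below e^{-u}, as the proposition predicts; the gap shows how conservative the bound is for this particular f.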

The next result allows us to bound the Poisson functional $\chi_m$. This result is
essentially Proposition 9 of [HO].

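Lemma 9.3 Let $N$ be a Poisson process on a measurable space $(V, \mathcal{V})$ with
mean measure $\mu(dv) = p(v)\eta(dv)$ and intensity function $p \in L^2(V, \mathcal{V}, \eta)$.
Let $S$ be a finite dimensional subspace of $L^2(V, \mathcal{V}, \eta)$ with orthonormal basis
$\{\varphi_1, \ldots, \varphi_d\}$, and define

$$\hat{p}(v) = \sum_{i=1}^{d}\Big(\int_V \varphi_i(w)N(dw)\Big)\varphi_i(v), \quad (9.5)$$

$$p^\perp(v) = \sum_{i=1}^{d}\Big(\int_V p(w)\varphi_i(w)\eta(dw)\Big)\varphi_i(v). \quad (9.6)$$

Then, $\chi(S) = \|\hat{p} - p^\perp\|_{L^2(\eta)}$ is such that for any $u > 0$ and $\varepsilon > 0$

$$P\Big[\chi(S) \ge (1+\varepsilon)\sqrt{E[\chi^2(S)]} + \sqrt{2kM_Su} + k(\varepsilon)B_Su\Big] \le e^{-u}, \quad (9.7)$$

where we can take $k = 6$, $k(\varepsilon) = 1.25 + 32/\varepsilon$, and where

$$M_S = \sup\Big\{\int_V f^2(v)p(v)\eta(dv) : f \in S,\ \|f\|_{L^2(\eta)} = 1\Big\}, \quad (9.8)$$

$$B_S = \sup\big\{\|f\|_\infty : f \in S,\ \|f\|_{L^2(\eta)} = 1\big\}. \quad (9.9)$$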

Following the same strategy as [HO], the idea is to deduce a concentration
inequality of the form

$$P\big[\|s - \tilde{s}\|^2 \le C\big(\|s - s_m^\perp\|^2 + \mathrm{pen}(m)\big) + h(\xi)\big] \ge 1 - C'e^{-\xi},$$

for constants $C$ and $C'$, and a function $h(\xi)$ (all independent of $m$). This
will prove to be enough in view of the following result (see Section 9.3 for a
proof).

Lemma 9.4 Let $h : [0, \infty) \to \mathbb{R}_+$ be a strictly increasing function with
continuous derivative and such that $h(0) = 0$ and $\lim_{\xi \to \infty} e^{-\xi}h(\xi) = 0$. If $Z$
is a random variable satisfying

$$P[Z > h(\xi)] \le Ke^{-\xi},$$

for every $\xi > 0$, then

$$E Z \le K\int_0^\infty e^{-u}h(u)\,du.$$
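
To see what Lemma 9.4 delivers in the situation used below, where the function is a quadratic polynomial in $\xi$, consider the following worked instance. The coefficients $a$ and $b$ are generic nonnegative numbers, not both zero, introduced here only for illustration; with them $h(\xi) = a\xi^2 + b\xi$ satisfies the hypotheses of the lemma.

```latex
% Worked instance of Lemma 9.4 with h(\xi) = a\xi^2 + b\xi, a, b \ge 0, a + b > 0.
\begin{align*}
E Z \;\le\; K\int_0^\infty e^{-u}\,(a u^2 + b u)\,du
      \;=\; K\,\big(a\,\Gamma(3) + b\,\Gamma(2)\big)
      \;=\; K\,(2a + b).
\end{align*}
```

In particular, a tail bound stated with $h(\xi) = f(\xi)/T$ for a quadratic $f$ turns into an expectation bound of order $1/T$.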

We are now in position to prove the main result of this section. Throughout
the proof, we shall have to introduce various constants and inequalities that
will hold with high probability. In order to clarify the role that the constants
play in these inequalities, we shall make some conventions and give to the
letters $x$, $y$, $f$, $a$, $b$, $\xi$, $\mathcal{K}$, $c$, and $C$, with various sub- or superscripts, special
meaning. The letters with $x$ are reserved to denote positive constants that
can be chosen arbitrarily. The letters with $y$ denote arbitrary constants
greater than 1. $f, f_1, f_2, \ldots$ denote quadratic polynomials of a variable $\xi$
whose coefficients (denoted by $a$'s and $b$'s) are determined by the values
of the $x$'s and $y$'s. The inequalities will be true with probabilities greater
than $1 - \mathcal{K}e^{-\xi}$, where $\mathcal{K}$ is determined by the values of the $x$'s and the $y$'s.
Finally, $c$'s and $C$'s are used for constants constrained by the $x$'s and $y$'s. It
is important to remember that the constants in a given inequality are meant
only for that inequality. The pair of equivalent inequalities below will be
repeatedly invoked throughout the proof:

(i) $2ab \le xa^2 + \frac{1}{x}b^2$, and (9.10)

(ii) $(a + b)^2 \le (1 + x)a^2 + \big(1 + \frac{1}{x}\big)b^2$, (for $x > 0$).
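
For completeness, the two inequalities above follow from expanding a square. The short derivation below is a standard argument, included only as a reminder:

```latex
\begin{align*}
0 \le \Big(\sqrt{x}\,a - \tfrac{1}{\sqrt{x}}\,b\Big)^2 = x a^2 - 2ab + \tfrac{1}{x} b^2
  &\;\Longrightarrow\; 2ab \le x a^2 + \tfrac{1}{x} b^2, \\
(a+b)^2 = a^2 + 2ab + b^2
  &\;\le\; (1+x)\,a^2 + \Big(1+\tfrac{1}{x}\Big) b^2 .
\end{align*}
```

Conversely, subtracting $a^2 + b^2$ from both sides of (ii) gives back (i), which is why the two are equivalent.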

Proof of Theorem 3.3. We consider successive improvements of the
inequality (9.1):

Inequality 1: For any positive constants $x_1$, $x_2$, $x_3$, and $x_4$, there is a positive
number $\mathcal{K}$, and an increasing quadratic function $f(\xi)$ (both independent of
the family of linear models and of $T$) such that, with probability larger than
$1 - \mathcal{K}e^{-\xi}$,



2 + 2xi + 2xi||4-sf "2 



"ml 



(9.11) 



+ pen(m) — pen(m) + ^ 



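Verification: Let us find an upper bound for $\nu_D(s_{m'}^\perp - s_m^\perp)$, $m', m \in \mathcal{M}$.
Since the operator $\nu_D$ defined by (9.2) is just a compensated integral with
respect to a Poisson process with mean measure $\mu(dt\,dx) = s(x)\,dt\,\eta(dx)$, we can
apply Proposition 9.2 to obtain that, for any $x'_{m'} > 0$, and with probability
larger than $1 - e^{-x'_{m'}}$,

$$\nu_D(s_{m'}^\perp - s_m^\perp) \le \Big\|\frac{s_{m'}^\perp - s_m^\perp}{T}\Big\|_{L^2(\mu)}\sqrt{2x'_{m'}} + \frac{\|s_{m'}^\perp - s_m^\perp\|_\infty\,x'_{m'}}{3T}. \quad (9.12)$$

In that case, the probability that (9.12) holds for every $m' \in \mathcal{M}$ is larger
than $1 - \sum_{m' \in \mathcal{M}} e^{-x'_{m'}}$ because $P(A \cap B) \ge 1 - a - b$, whenever $P(A) \ge 1 - a$
and $P(B) \ge 1 - b$. Clearly,

$$\Big\|\frac{s_{m'}^\perp - s_m^\perp}{T}\Big\|^2_{L^2(\mu)} = \iint_{[0,T]\times D}\Big(\frac{s_{m'}^\perp(x) - s_m^\perp(x)}{T}\Big)^2 s(x)\,dt\,\eta(dx) \le \|s\|_\infty\frac{\|s_{m'}^\perp - s_m^\perp\|^2}{T}.$$

Using (9.10-i), the first term on the right hand side of (9.12) is then bounded
as follows:

$$\Big\|\frac{s_{m'}^\perp - s_m^\perp}{T}\Big\|_{L^2(\mu)}\sqrt{2x'_{m'}} \le x_1\|s_{m'}^\perp - s_m^\perp\|^2 + \frac{\|s\|_\infty x'_{m'}}{2Tx_1},$$

for any $x_1 > 0$. Using (9.10-i) together with the bound $\|f\|_\infty \le \sqrt{D_{m'}}\,\|f\|$ for $f \in S_{m'}$,

$$\|s_{m'}^\perp - s_m^\perp\|_\infty\,x'_{m'} \le \big(\sqrt{D_{m'}}\|s_{m'}^\perp\| + \sqrt{D_m}\|s_m^\perp\|\big)x'_{m'} \le \sqrt{D_{m'}}\|s\|x'_{m'} + \sqrt{D_m}\|s\|x'_{m'}$$
$$\le 3x_2D_{m'} + 3x_3D_m + \frac{\|s\|^2x'^2_{m'}}{12}\Big(\frac{1}{x_2} + \frac{1}{x_3}\Big), \quad (9.13)$$

for all $x_2 > 0$, $x_3 > 0$. It follows that, for any $x_1 > 0$, $x_2 > 0$, and $x_3 > 0$,

$$\nu_D(s_{m'}^\perp - s_m^\perp) \le x_1\|s_{m'}^\perp - s_m^\perp\|^2 + x_2\frac{D_{m'}}{T} + x_3\frac{D_m}{T} + \frac{\|s\|_\infty x'_{m'}}{2Tx_1} + \frac{\|s\|^2x'^2_{m'}}{36T\bar{x}},$$

where we set $\frac{1}{\bar{x}} = \frac{1}{x_2} + \frac{1}{x_3}$. Next, take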



X'^, = X4,\/dm' ( 77^ A ] + ^. 



Then, for any positive $x_1$, $x_2$, $x_3$, and $x_4$, there is a $\mathcal{K}$ and a function $f$ such
that, with probability greater than $1 - \mathcal{K}e^{-\xi}$,



{si, - si) < X,\\si, - siW^ + X2^ + X3^ 

Concretely, 



(9.14) 



(9.15) 



Here, we use the assumption of polynomial models (Definition 3.1) to come
up with the constant $\mathcal{K}$. Plugging (9.14) in (9.1), and renaming the coefficient
of $d_{m'}/T$, we can corroborate Inequality 1.

Inequality 2: For any positive constants $y_1 > 1$, $x_2$, $x_3$, and $x_4$, there are
positive constants $C_1 < 1$, $C_1' > 1$, and $\mathcal{K}$, and a strictly increasing quadratic
polynomial $f$ (all independent of the class of linear models and $T$) such that
with probability larger than $1 - \mathcal{K}e^{-\xi}$,

C,\\s-~sr < C[\\s - sir + y,xl 

+X2^ + x^R^ + Xi^ (9.16) 
+ pen(m) — pen(m) + 

Moreover, if $1 < y_1 < 2$, then $C_1' = 3 - y_1$ and $C_1 = y_1 - 1$. If $y_1 \ge 2$, then
$C_1' = 1 + 4x_1$ and $C_1 = 1 - 4x_1$, where $x_1$ is any positive constant related to
$f$ according to equation (9.15).

Verification: Let us combine the term on the left hand side of (9.11) with
the first three terms on the right hand side. Using the triangle inequality
followed by (9.10-ii),

$$\|s_{\hat m}^\perp - s_m^\perp\|^2 \le 2\|s_{\hat m}^\perp - s\|^2 + 2\|s - s_m^\perp\|^2.$$

Then, since $\chi_{\hat m}^2 = \|s_{\hat m}^\perp - \tilde{s}\|^2$, and $\|s_{\hat m}^\perp - s\|^2 = \|s - \tilde{s}\|^2 - \|s_{\hat m}^\perp - \tilde{s}\|^2$, it
follows that

$$\|s - s_m^\perp\|^2 + 2\chi_{\hat m}^2 + 2x_1\|s_{\hat m}^\perp - s_m^\perp\|^2 - \|s - \tilde{s}\|^2$$
$$\le (1 + 4x_1)\|s - s_m^\perp\|^2 + (2 - 4x_1)\chi_{\hat m}^2 + (4x_1 - 1)\|s - \tilde{s}\|^2,$$

for every $x_1 > 0$. Then, for any $y_1 > 1$, there are positive constants $C_1' > 1$
and $C_1 < 1$ such that

$$\|s - s_m^\perp\|^2 + 2\chi_{\hat m}^2 + 2x_1\|s_{\hat m}^\perp - s_m^\perp\|^2 - \|s - \tilde{s}\|^2 \le C_1'\|s - s_m^\perp\|^2 + y_1\chi_{\hat m}^2 - C_1\|s - \tilde{s}\|^2. \quad (9.17)$$

Combining (9.11) and (9.17), we obtain (9.16).

Inequality 3: For any $y_2 > 1$ and positive constants $x_i$, $i = 2, 3, 4$, there exist
positive numbers $C_1 < 1$, $C_1' > 1$, an increasing quadratic polynomial of the
form $f_2(\xi) = a\xi^2 + b\xi$, and a constant $\mathcal{K}_2 > 0$ (all independent of the family
of linear models and of $T$) so that, with probability greater than $1 - \mathcal{K}_2e^{-\xi}$,



c,\\s-~sr <c[\\s-sir 

+1/2^ + 2:2^ + 3:3^ -pen(?fi) (9.18) 
+X4^ + pen(m) + 

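Verification: We bound $\chi_{m'}$ using Lemma 9.3 with $V = \mathbb{R}_+ \times D$ and $\mu(dt\,dx) =
s(x)\,dt\,\eta(dx)$. We regard the linear model $S_m$ as a subspace of $L^2(\mathbb{R}_+ \times
D, dt\,\eta(dx))$ with orthonormal basis $\{\varphi_{1,m}/\sqrt{T}, \ldots, \varphi_{d_m,m}/\sqrt{T}\}$. Recall that

$$\chi_m^2 = \|s_m^\perp - \hat{s}_m\|^2 = \sum_{i=1}^{d_m}\Big(\iint_{[0,T]\times D}\varphi_{i,m}(x)\,\frac{\mathcal{J}(dt,dx) - s(x)\,dt\,\eta(dx)}{T}\Big)^2.$$

Then, with probability larger than $1 - \sum_{m' \in \mathcal{M}} e^{-x'_{m'}}$,

$$\sqrt{T}\chi_{m'} \le (1 + x_1)\sqrt{V_{m'}} + \sqrt{2kM_{m'}x'_{m'}} + k(x_1)B_{m'}x'_{m'}, \quad (9.19)$$

for every $m' \in \mathcal{M}$, where $B_{m'} = \sqrt{D_{m'}/T}$,

$$V_{m'} = \int_D\Big(\sum_{i=1}^{d_{m'}}\varphi_{i,m'}^2(x)\Big)s(x)\,\eta(dx), \text{ and} \quad (9.20)$$

$$M_{m'} = \sup\Big\{\int_D f^2(x)s(x)\,\eta(dx) : f \in S_{m'},\ \|f\| = 1\Big\}.$$

Since $\int_D f^2(x)s(x)\,\eta(dx) \le \|f\|_\infty\|s\|$ when $\|f\| = 1$, $M_{m'}$ is bounded above by $\sqrt{D_{m'}}\,\|s\|$. In
that case, we can use (9.10-i) to obtain

$$\sqrt{2kM_{m'}x'_{m'}} \le x_2\sqrt{D_{m'}} + \frac{k\|s\|}{2x_2}x'_{m'},$$

for any $x_2 > 0$. On the other hand, by hypothesis $D_{m'} \le T$, and (9.19)
implies that

$$\sqrt{T}\chi_{m'} \le (1 + x_1)\sqrt{V_{m'}} + x_2\sqrt{D_{m'}} + \Big(\frac{k\|s\|}{2x_2} + k(x_1)\Big)x'_{m'},$$

where the constants $x'_{m'}$ are chosen as

$$x'_{m'} = \frac{x_3\sqrt{d_{m'}}}{\frac{k\|s\|}{2x_2} + k(x_1)} + \xi.$$

Then, for any $x_1 > 0$, $x_2 > 0$, $x_3 > 0$, and $\xi > 0$,

$$\sqrt{T}\chi_{m'} \le (1 + x_1)\sqrt{V_{m'}} + x_2\sqrt{D_{m'}} + x_3\sqrt{d_{m'}} + f_1(\xi), \quad (9.21)$$

with probability larger than $1 - \mathcal{K}_1e^{-\xi}$, where

$$f_1(\xi) = \Big(\frac{k\|s\|}{2x_2} + k(x_1)\Big)\xi,$$
$$\mathcal{K}_1 = \Gamma\sum_{n=1}^{\infty} n^R\exp\Big(-\sqrt{n}\,x_3\Big/\Big(\frac{k\|s\|}{2x_2} + k(x_1)\Big)\Big).$$

Squaring (9.21) and using (9.10-ii) repeatedly, we conclude that, for any
$y > 1$, $x_2 > 0$, and $x_3 > 0$, there are both a constant $\mathcal{K}_1 > 0$ and a quadratic
function of the form $f_2(\xi) = a\xi^2$ (independent of $T$, $m'$, and the family of
linear models) such that, with probability greater than $1 - \mathcal{K}_1e^{-\xi}$,

$$\chi_{m'}^2 \le y\frac{V_{m'}}{T} + x_2\frac{D_{m'}}{T} + x_3\frac{d_{m'}}{T} + \frac{f_2(\xi)}{T}, \quad \forall m' \in \mathcal{M}. \quad (9.23)$$

Then, (9.18) immediately follows from (9.16) and (9.23).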

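Proof of (3.5) for case (c):

By the inequality (9.4), we can upper bound $V_{m'}$ by $\hat{V}_{m'}$ on an event of large
probability. Namely, for every $x'_{m'} > 0$ and $x > 0$, with probability greater
than $1 - \sum_{m' \in \mathcal{M}} e^{-x'_{m'}}$,

$$(1 + x)\Big(\hat{V}_{m'} + \Big(\frac{1}{2x} + \frac{5}{6}\Big)\frac{D_{m'}}{T}x'_{m'}\Big) \ge V_{m'}, \quad \forall m' \in \mathcal{M}, \quad (9.24)$$

(recall that $D_m = \|\sum_{i=1}^{d_m}\varphi_{i,m}^2\|_\infty$). Since by hypothesis $D_{m'} \le T$, and choosing

$$x'_{m'} = x'd_{m'} + \xi, \quad (x' > 0),$$

it is seen that for any $x > 0$ and $x_4 > 0$, there are a positive constant $\mathcal{K}_2$ and
a function $f(\xi) = b\xi$ (independent of $T$ and of the linear models) such that
with probability greater than $1 - \mathcal{K}_2e^{-\xi}$

$$(1 + x)\hat{V}_{m'} + x_4d_{m'} + f(\xi) \ge V_{m'}, \quad \forall m' \in \mathcal{M}. \quad (9.25)$$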

Here, we get $\mathcal{K}_2$ from the polynomial assumption on the class of models.
Combining (9.25) and (9.18), it is clear that for any $y_2 > 1$, and positive $x_i$,
$i = 1, 2, 3$, we can choose a pair of positive constants $C_1 < 1$, $C_1' > 1$, an
increasing quadratic polynomial of the form $f(\xi) = a\xi^2 + b\xi$, and a constant
$\mathcal{K} > 0$ (all independent of the family of linear models and of $T$) so that, with
probability greater than $1 - \mathcal{K}e^{-\xi}$



Ci||s-Sf <C[\\s-s 



m I 



+Z/2^ + + X2^ -pen(m) (9.26) 
+X3^ + pen(m) + 

Next, we take $y_2 = c$, $x_1 = c'$, and $x_2 = c''$ to cancel $-\mathrm{pen}(\hat m)$ in (9.26). By
Lemma 9.4, it follows that

C^E [\\s - Sf] <C[\\s- sif + (1 + ^) E [pen(m)] + ^. (9.27) 
Since $m$ is arbitrary, we obtain the case (c) of (3.5).

Proof of (3.5) for case (a):

By Remark 3.2, we can bound $V_{m'}$, as given in (9.20), by $D_{m'}\rho$ (assuming
that $\rho < \infty$). On the other hand, (9.4) implies that

with probability greater than $1 - e^{-\xi}$. Using these bounds for $V_{m'}$ and the
assumption that $D_{m'} \le T$, (9.18) reduces to

Ci\\s-rsr <c;||.-.^f 

+y^fT^ + xi^ -pen{m) (9.29) 
+X2-^# + pen(m) + 

which is valid with probability $1 - \mathcal{K}e^{-\xi}$. In (9.29), $y > 1$, $x_1 > 0$ and $x_2 > 0$
are arbitrary, while $C_1$, $C_1'$, the increasing quadratic polynomial of the form
$f(\xi) = a\xi^2 + b\xi$, and a constant $\mathcal{K} > 0$ are determined by $y$, $x_1$, and $x_2$
independently of the family of linear models and of $T$. We point out that
we divided and multiplied by $\rho$ the terms $D_{\hat m}/T$ and $D_m/T$ in (9.18), and
then applied (9.28) to get (9.29). It is now clear that $y = c$ and $x_1 = c'$ will
produce the desired cancellation.

Proof of (3.5) for case (b):

We first upper bound by P'^Vm and c?^ by {P(f))~^Vm in the inequality 
(EIHl): 



Ci\\s - 5f < C[\\s - sif + {y + x,l3-' + X2if3(l))-') ^ 
— pen(m) + x^jS'^^ + pen(m) + 



(9.30) 



Then, using $d_{m'} \le (\beta\phi)^{-1}V_{m'}$ in (9.25) and letting $x_4(\beta\phi)^{-1}$ vary on $(0, 1)$,
we verify that for any $x' > 0$, a positive constant $\mathcal{K}_4$ and a polynomial $f$ can
be found so that with probability greater than $1 - \mathcal{K}_4e^{-\xi}$,

$$(1 + x')\hat{V}_{m'} + f(\xi) \ge V_{m'}, \quad \forall m' \in \mathcal{M}. \quad (9.31)$$

Putting together (9.31) and (9.30), it is clear that for any $y > 1$ and $x_1 > 0$,
we can find a pair of positive constants $C_1 < 1$, $C_1' > 1$, an increasing
quadratic polynomial of the form $f(\xi) = a\xi^2 + b\xi$, and a constant $\mathcal{K} > 0$ (all
independent of the family of linear models and of $T$) so that, with probability
greater than $1 - \mathcal{K}e^{-\xi}$,

Ci\\s-sf < C[\\s - sif + - pen{m) 

+ pen(m) + 

In particular, by taking $y = c$, the term $-\mathrm{pen}(\hat m)$ cancels out. Lemma 9.4
implies that

$$C_1E\big[\|s - \tilde{s}\|^2\big] \le C_1'\|s - s_m^\perp\|^2 + (1 + x_1)E[\mathrm{pen}(m)] + \frac{C''}{T}. \quad (9.33)$$

Finally, (3.5)(b) follows since $m$ is arbitrary. □



Remark 9.5 Let us analyze more carefully the values that the constants $C$
and $C'$ can take in the inequality (3.5). For instance, consider the penalty
function of part (c). As we saw in (9.27), the constants $C$ and $C'$ are
determined by $C_1$, $C_1'$, $C''$, and $x_3$. The constant $C_1$ was proved to be $y_1 - 1$ if
$1 < y_1 < 2$, while it can be made arbitrarily close to one otherwise (see the
comment immediately after (9.16)). On the other hand, $y_1$ itself can be made
arbitrarily close to the penalization parameter $c$ since $c = y_2 = y_1(1 + x)y$,
where $x$ is as in (9.24) and $y$ is in (9.23). Then, when $c > 1$, $C_1$ can be
made arbitrarily close to one at the cost of increasing $C''$ in (9.27).
Similarly, paying the same cost, we are able to select $C_1'$ as close to one as we
wish and $x_3$ arbitrarily small. Therefore, it is possible to find for any $\varepsilon > 0$,
a constant $C'(\varepsilon)$ (increasing in $\varepsilon$) so that

$$E\|s - \tilde{s}\|^2 \le (1 + \varepsilon)\inf_{m \in \mathcal{M}}\big\{\|s - s_m^\perp\|^2 + E[\mathrm{pen}(m)]\big\} + \frac{C'(\varepsilon)}{T}. \quad (9.34)$$

A more thorough inspection shows that

$$\lim_{\varepsilon \to 0} C'(\varepsilon)\,\varepsilon = K,$$

where $K$ depends only on $c$, $c'$, $c''$, $\Gamma$, $R$, and $\|s\|_\infty$. The same reasoning
applies to the other two types of penalty functions when $c > 2$. In particular,
we point out that $C$ can be made arbitrarily close to 2 in the Oracle inequality
(3.8) at the price of having a large $C'$ constant.
9.2 Proof of the minimax results

Proof of Theorem 4.1

Fix a Levy density $s_0 \in \Theta_\alpha(L/2; [a, b])$ such that $s_0(x) > 0$, for all $x \in [a, b]$,
and a constant $\kappa > 0$. Consider a bounded function $g : \mathbb{R} \to \mathbb{R}_+$, with
compact support $K \subset [-1, 1]$, that meets (4.2) with $L/2$ (instead of $L$) for
all $x_1, x_2 \in \mathbb{R}$, $g(0) > 0$, increasing for $x < 0$, and decreasing for $x > 0$.
Moreover, the support and the maximum value of $g$ are chosen small enough
so that

$$s_0(x) - \kappa^{-\alpha}g(\kappa(x - x_0)) > 0, \quad \forall x \in [a, b],$$

and the support of $g(\kappa(x - x_0))$ is contained in $(a, b)$. Let us consider the
parametric model


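$$s_\theta(x) := s_0(x) + \theta\,T^{-\frac{\alpha}{2\alpha+1}}\,g\big(\kappa T^{\frac{1}{2\alpha+1}}(x - x_0)\big), \quad x \in \mathbb{R}_0,$$

parametrized by $\theta \in (-\kappa^{-\alpha}, \kappa^{-\alpha})$. Notice that the function $s_\theta$ is a Levy
density for any $T > 1$ and $|\theta| < \kappa^{-\alpha}$. Now, for $x_1, x_2 \in [a, b]$,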



(^■)(X0-4'^^(X2)|< S^^>{X,)-S'^>{X,) 



+ 



I — CK + fc 



gik) (^^T^(xi - xo)) - g^'^ («:T^(X2 - Xo)) 

< ^^1^^ _ + -\0\n'+PT^^\x, - x,f 

< L\xi — X2 



1/3 

I 5 

implying that $s_\theta \in \Theta$ whenever $|\theta| < \kappa^{-\alpha}$.

Let $\mathcal{M}_0$ be the space of atomic measures on $[0, T] \times [a, b]$ and let $P_\theta^{(T)}$ be
the probability measure on $\mathcal{M}_0$ induced by those jumps of the Levy process
$\{X(t)\}_{0 \le t \le T}$ whose sizes lie on $[a, b]$ and where the Levy density of the process
is $s_\theta$. In other words, $P_\theta^{(T)}$ is the distribution of a Poisson process on $[0, T] \times
[a, b]$ with mean measure $dt\,s_\theta(x)\,dx$. Using Theorem 1.3 of [25],



' (0=exp|y J ln^l + eT-^So\x)g[T^{x-Xo)jj^{dt,dx) 



(T)' 
The goal is to prove the LAN (local asymptotic normality) property for the
parametric model $\{P_\theta^{(T)} : \theta \in (-\kappa^{-\alpha}, \kappa^{-\alpha})\}$ at $\theta = 0$ (see definition 2.1 of
[25]). Now, define $R(u) = \ln(1 + u) - u + \frac{u^2}{2}$. The right hand side of the
above equation can be written as follows:



where 



(0 = exp^A,--<+r,(^) 



A^=T~^ J J SQ^{x)g\T^{x-xo)j[^{dt,dx)-SQ{x)dtdx], 
al =T^~ f So\x)g^ (T^{x-xo))dx, 

J a 

r^{e) = J J SQ^{x)g^ \ T^{x -Xo)j [^{dt,dx) - So{x)dtdx] 

^ loL ^ (^T^^So ^(x)^ (t^{x - xo))) ^{dt, dx). 
We want to prove that there are nomalizing constants > such that 

£p(T) (v^tAJ ^ Ar(0,l), ^lal ^ 1, and r^{e) ^ 

as T — >■ cxD. To prove the first limit, we invoke the CLT for Poisson integrals 
by verifying the Liapunov condition (see Theorem 1.1 and Remark 1.2 of 
1^). For T > 1, we have that 



5 

T rb 



T-^&- j j s^'^-\x)g^+^ (^kT^{x - xo)^ {so{x)) dxdt = 
Jk 



Notice also that, for large enough T 

r-T pb 



Var(A^) = T~5lfT ^ J SQ^{x)g'^ (^kT^ {x - Xq)^ {so{x)) dxdt 

= K,^^T^^^^~2^ / Sq^{k^^T^^^u + xo)g'^ (u) du 
Jk 



T^CXD -1-1 



K Sq (xq) / g {u)du. 
Then, $\mathcal{L}_{P_0^{(T)}}(\Delta_T) \to \mathcal{N}(0, I_0)$ with $I_0 := \kappa^{-1}s_0^{-1}(x_0)\int_K g^2(u)\,du$. In
particular, we also prove that $\sigma_T^2 \to I_0$. We now verify that $r_T(\theta)$ vanishes in
probability. Notice that the first term of $r_T(\theta)$ has mean 0 and variance



0^ „ 4a 

4 

e 



T 



SQ'^{x)g^ ( i^T'^°'+^ {x — xo) ) {so{x))dxdt 



_ -1 -1 _ 4a 1 

-'"2'' 2a + l 2a + l 



(k T 2«+im + xo)5' ('u)(i-u 



4 /„A j„. r-*^^ g 



Then, the first term converges in probability to 0. Similarly, the second 
term of rxiO) converges to in probability because its mean and variance 
both goes to 0. Indeed, using that < |mP/3, the absolute value of its 

expectation satisfies 



^ R(^9T'^SQ\x)g (^kT^ 



+1 (x — xq) ) ) {so{x))dxdt 



< 



-T 



1-7 



So^(x)/ (KT2a+i [x-xo] 



dx ^-i^ 0. 



A similar reasoning applies to the variance. Therefore, $\{P_\theta^{(T)}\}_{\theta \in (-\kappa^{-\alpha}, \kappa^{-\alpha})}$
is Locally Asymptotically Normal (LAN) at $\theta = 0$ (with the normalizing
constants $\varphi_T := I_0^{-1/2}$). We are now in position of using the theory for LAN
families (see [22] for the general theory and for the case of Poisson
processes). In particular, by (2.11) of [25], if for each $T > 0$, $\hat\theta_T$ is an arbitrary
estimator of $\theta$, based on the jumps of the Levy process happening on or
before $T$ and with sizes in $[a, b]$, then



lim inf sup Eg 



\e\<K' 



Jo 



> 5, 



(9.35) 



where B ■=E [ioiZ)x[\z\<ioK--/2]] and Z ~ J\f{0, 1). 

Now, for each T > 0, let s^, be an arbitrary estimator, based on the jumps of 
the Levy process happening on or before T and with sizes in [a,b]. Clearly, 
Sj, induces the estimator 9^ := g~^{0) (Sj,(xo) — Sq^Xq)) , and since 9 = 
T2a+i (^-1(0) (s6i(xo) — so{xo)) , we can write 

^(0) (9^ -9)= (s^(xo) - seixo)) . 
If we take := £ {g{0)I^^u), (l!OH|l becomes:



B < liminf sup Kg 



\e\<K- 



'0 I 4 



Since {se-.ee (-A;"", fc"")} C 6, 



/ a 

liminf sup Kg i I T2°=+i (Sy(xo) — sg{xo)) 



liminf sup Es i Ta^+i (Sj,(xo) - s{xo)) 

T^OO gfzQ L 



> 5, 



(9.36) 



(9.37) 



where 

B ■= 2-^^/2^-1/2 /■ £(^(0)/o-i^)e-"'/2rfz. 

J\z\<IoK-'^/2 

This implies (4.3) because the lower bound $B$ does not depend on the family
of estimators $\hat{s}_T$. Indeed, for each $\varepsilon > 0$, let $\hat{s}_T^{(\varepsilon)}$ be such that



sup Kg 

see 



£(r^ (^sP(xo)-s{xo))) 

<infsupEs i [T^ {s^{xo) - s{xo))] 
see L V / 



Taking the liminf as $T \to \infty$ on both sides, we obtain (4.3) since $\varepsilon$ is
arbitrary. □

Proof of Corollary 4.2

We first notice that the proof of Theorem 4.1 can be modified so that (9.36)
holds true even if $x_0$ is not fixed. That is, for any family of estimators $\{\hat{s}_T\}$
and points $x_T \in (a, b)$,



lim inf sup E^ 



see 



-/ 2Q! -|- 1 ^ rp J 



s{x^)) 



>C, 



(9.38) 



for a constant C > 0, which is independent of the family of estimators and 
of the points. Indeed, we can construct a parametric model of the form 



so{x) + 9T 2<.+i^^ fi;T2-+i(x-Xy) , a; e Mo, 



where \6\ < n " and where is as in the previous proof. Moreover, without 
loss of generality, < infy (7^,(0) < sup j- (7^,(0) < oo, since Sq is continuous 
and strictly positive on (a, b). Let be the distribution of a Poisson
process on [0, T] x [a, b] with mean measure dtSg ^{x)dx. Following the same 

arguments as above, {P^ . 9 E (— is Locally Asymptotically 
Normal (LAN) at ^ = with the normalizing constants 



(f^ := K 



Sg (k T ■^"+^u + x^)g iu)du 



Observe that there is an m > for which miT'-PT > By (2.11) of |25j, for 

any 5 > 0, 



lim inf sup Eg 



\e\<&vj 



>C, 



(9.39) 



where C := E [^o(^)X[|z|<<5/2]] Z ~ A/'(0, 1). Since (Pj, >m and ^odz/l) is 
increasing in y, 



lim inf sup Eg 



|e|<5<pj 



>C, 



(9.40) 



Now, take 



9^ := T^(7-^(0) (s^(xj - so(xj) 



Since ^ = T2-+i^^i(0) {s,_^{x^) - So(x^)), 



sup E^ 
see 



i{g,iO) (9^-9 

Th(L-9 



> sup E51 

\e\<Sfj. 

> sup Kg 

where m = inf^, gj,{0). Taking lim inf as T — 00, ()9.38p is obtained with 



(9.41) 



|2|<5/2 



Finally, ()4.4|) can be deduced as follows. For each e > 0, let s^^^ E O and 
x'-^^ E (a, 6) be such that 



sup Es 

see 



< 



inf inf sup E^ 

x&{a,b) Sy see 



£ (s^(x)-s(x)) 



+ e. 
Next, take the liminf as $T \to \infty$ on both sides above, and apply (9.38).
Finally, let $\varepsilon \to 0$. □

Proof of Corollary 4.3

Fix a measurable estimator $\hat{s}_T$ and an $s \in \Theta$. By Fubini's Theorem,



(■Sj,(x) — s(x))^ dx 



Es — s(a;))^] dx. 



Now, for each e > 0, there exists an Xq^ G (a, h) satisfying 



h — a 
Then, 

sup E^ 
- a see 



s{x)) ] dx > E, 



(e) 
Sj, I Xq 



S X, 



^0 



> sup Eg 

see 



— £ 



> inf supEs [(s. 



Letting $\varepsilon \to 0$, (4.5) becomes a consequence of (4.4) with $\ell(u) = u^2$. □



9.3 Some additional proofs 

Proof of Corollary 3.6. The idea is to estimate the bias and the penalized
term in (3.5). Clearly, the dimension $d_m$ of $S_m$ is $m(k + 1)$. Also, $D_m$ is
bounded by $(k + 1)^2m/(b - a)$ (see (7) in pL2J), and

$$E[\hat{V}_m] = \int_a^b\sum_{i}\varphi_{i,m}^2(x)s(x)\,dx \le (k + 1)m\|s\|_\infty,$$

since the V5j,m,'s are orthonormal. On the other hand, by Chapter 2 (10.1) in 
[T7j . if s G (L^([o, &])), there is a polynomial q G iS^ such that 



Thus, 



|s - s^ll < C[„](6 - a) 2 f+"|s|Bo^(LP)m" 
By (3.5), there is a constant $M$ (depending on $C$, $c$, $c'$, $c''$, $\alpha$, $k$, $b - a$, $p$,
$|s|_{B^\alpha_\infty(L^p)}$, and $\|s\|_\infty$), for which

$$E\big[\|s - \tilde{s}\|^2\big] \le M\inf_{m}\Big\{m^{-2\alpha} + \frac{m}{T}\Big\} + \frac{C'}{T}.$$

It is not hard to see that, for large enough $T$, the infimum on the above right
hand side is $O_\alpha(T^{-2\alpha/(2\alpha+1)})$ (where $O_\alpha$ means that the term depends only
on $\alpha$). Since $M$ is monotone in $|s|_{B^\alpha_\infty(L^p)}$ and $\|s\|_\infty$, (3.10) is verified. □
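
The order of the infimum in the last step can be checked numerically. The Python sketch below is only an illustration: it fixes the smoothness at the assumed value alpha = 1, minimises the proxy m^(-2 alpha) + m/T over integer m by brute force, and compares the result with T^(-2 alpha/(2 alpha + 1)). The function names and the search range `m_max` are choices made here.

```python
def risk_proxy(m, T, alpha):
    """Bias-plus-variance proxy m^(-2*alpha) + m/T appearing in the bound above."""
    return m ** (-2.0 * alpha) + m / T

def best_over_models(T, alpha, m_max=10_000):
    """Minimise the proxy over integer model indices m = 1, ..., m_max."""
    return min(risk_proxy(m, T, alpha) for m in range(1, m_max + 1))

alpha = 1.0  # assumed smoothness, for illustration only
ratios = {}
for T in (1e3, 1e4, 1e5):
    rate = T ** (-2.0 * alpha / (2.0 * alpha + 1.0))
    ratios[T] = best_over_models(T, alpha) / rate
    print(T, ratios[T])
```

The ratio stays essentially constant as T grows by two orders of magnitude, which is what the stated rate means; the minimising m grows like T^(1/(2 alpha + 1)).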

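Verification of Remark 3.2. Suppose that $D_m$ is finite, and thus each
$f \in S_m$ with $\|f\| = 1$ is bounded. It follows using Lagrange multipliers that,
for each $x \in D$,

$$D(x) := \sup\Big\{\Big|\sum_{i=1}^{d_m} c_i\varphi_i(x)\Big|^2 : \sum_{i=1}^{d_m} c_i^2 = 1\Big\} = \sum_{i=1}^{d_m}\varphi_i^2(x).$$

Since $D_m \ge D(x)$ for every $x \in D$, we obtain $D_m \ge \|\sum_{i=1}^{d_m}\varphi_i^2\|_\infty$. On the
other hand, for every $\varepsilon > 0$, there are $b_1, \ldots, b_{d_m}$ satisfying $\sum_{i=1}^{d_m} b_i^2 = 1$ and
an $x \in D$ such that

$$D_m - \varepsilon \le \Big|\sum_{i=1}^{d_m} b_i\varphi_i(x)\Big|^2 \le \sum_{i=1}^{d_m}\varphi_i^2(x) \le \Big\|\sum_{i=1}^{d_m}\varphi_i^2\Big\|_\infty.$$

Letting $\varepsilon \to 0$, it follows that $D_m = \|\sum_{i=1}^{d_m}\varphi_i^2\|_\infty$.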
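
The pointwise identity behind this argument is that, for a fixed x, the largest value of the squared linear combination over unit coefficient vectors equals the sum of the squared basis values. A small Python sketch can confirm this on made-up numbers; the vector `v` below stands for hypothetical values of the basis functions at one point and is not taken from the paper.

```python
import math
import random

def sup_over_unit_sphere(v, trials, rng):
    """Largest value of (c . v)^2 over randomly drawn unit vectors c."""
    best = 0.0
    for _ in range(trials):
        c = [rng.gauss(0.0, 1.0) for _ in v]
        norm = math.sqrt(sum(ci * ci for ci in c))
        c = [ci / norm for ci in c]
        best = max(best, sum(ci * vi for ci, vi in zip(c, v)) ** 2)
    return best

# Hypothetical values phi_1(x), ..., phi_d(x) of an orthonormal basis at a fixed x.
v = [0.3, -1.2, 0.7, 2.0]
sum_sq = sum(vi * vi for vi in v)

# Maximiser suggested by the Lagrange-multiplier argument: c proportional to v.
c_star = [vi / math.sqrt(sum_sq) for vi in v]
attained = sum(ci * vi for ci, vi in zip(c_star, v)) ** 2

rng = random.Random(1)
random_best = sup_over_unit_sphere(v, trials=5000, rng=rng)
print(attained, sum_sq, random_best)
```

The choice of c proportional to v attains the sum of squares, and no randomly drawn unit vector exceeds it, in line with the Cauchy-Schwarz inequality.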
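Proof of Lemma 9.1. Clearly, $\gamma_D$ as defined by (2.9) can be written as

$$\gamma_D(f) = \|f\|^2 - 2\langle f, s\rangle - 2\nu_D(f) = \|f - s\|^2 - \|s\|^2 - 2\nu_D(f).$$

By the very definition of $\tilde{s}$ as the penalized projection estimator and by
Remark

$$\gamma_D(\tilde{s}) + \mathrm{pen}(\hat m) \le \gamma_D(\hat{s}_m) + \mathrm{pen}(m) \le \gamma_D(s_m^\perp) + \mathrm{pen}(m),$$

for any $m \in \mathcal{M}$. Using the previous two equations:

$$\|\tilde{s} - s\|^2 = \gamma_D(\tilde{s}) + \|s\|^2 + 2\nu_D(\tilde{s})$$
$$\le \gamma_D(s_m^\perp) + \|s\|^2 + 2\nu_D(\tilde{s}) + \mathrm{pen}(m) - \mathrm{pen}(\hat m)$$
$$= \|s_m^\perp - s\|^2 + 2\nu_D(\tilde{s} - s_m^\perp) + \mathrm{pen}(m) - \mathrm{pen}(\hat m).$$

Finally, notice that $\nu_D(\tilde{s} - s_m^\perp) = \nu_D(\tilde{s} - s_{\hat m}^\perp) + \nu_D(s_{\hat m}^\perp - s_m^\perp)$ and $\nu_D(\hat{s}_{\hat m} - s_{\hat m}^\perp) = \chi_{\hat m}^2$. □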



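Verification of inequality (9.4): Notice just that for any $a, b, \varepsilon > 0$:

$$a - \sqrt{2ab} - \frac{1}{3}b \ge \frac{a}{1 + \varepsilon} - \Big(\frac{1}{2\varepsilon} + \frac{5}{6}\Big)b. \quad (9.42)$$

Evaluating the integral in (9.3) for $-f$, we can write

$$P\Big[\int_V f(x)N(dx) \ge \int_V f(x)\mu(dx) - \|f\|_{L^2(\mu)}\sqrt{2u} - \frac{1}{3}\|f\|_\infty u\Big] \ge 1 - e^{-u}.$$

Using that $\|f\|^2_{L^2(\mu)} \le \|f\|_\infty\int_V f(x)\mu(dx)$ and (9.42),

$$P\Big[\int_V f(x)N(dx) \ge \frac{1}{1 + \varepsilon}\int_V f(x)\mu(dx) - \Big(\frac{1}{2\varepsilon} + \frac{5}{6}\Big)\|f\|_\infty u\Big] \ge 1 - e^{-u},$$

which is precisely inequality (9.4). □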
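
The elementary inequality (9.42) used in this verification can be checked numerically. The sketch below evaluates both sides on a grid of positive values of a, b and epsilon; the grid and the function names are arbitrary choices made here. In exact arithmetic the difference of the two sides is a perfect square, so the smallest gap over any grid should never be negative.

```python
import math

def lhs(a, b):
    """Left-hand side of (9.42): a - sqrt(2ab) - b/3."""
    return a - math.sqrt(2.0 * a * b) - b / 3.0

def rhs(a, b, eps):
    """Right-hand side of (9.42): a/(1+eps) - (1/(2 eps) + 5/6) b."""
    return a / (1.0 + eps) - (1.0 / (2.0 * eps) + 5.0 / 6.0) * b

grid = [10.0 ** k for k in range(-3, 4)]
worst_gap = min(lhs(a, b) - rhs(a, b, e) for a in grid for b in grid for e in grid)
print(worst_gap)
```

Writing theta = eps/(1 + eps), the gap equals (sqrt(theta a) - sqrt(b/(2 theta)))^2, which explains both the inequality and the constant 5/6 = 1/2 + 1/3.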

Proof of Lemma 9.4

Let $Z^+$ be the positive part of $Z$. First,

$$E[Z] \le E[Z^+] = \int_0^\infty P[Z > x]\,dx.$$

Since $h$ is continuous and strictly increasing, $P[Z > x] \le K\exp(-h^{-1}(x))$,
where $h^{-1}$ is the inverse of $h$. Then, changing variables to $u = h^{-1}(x)$,

$$\int_0^\infty P[Z > x]\,dx \le K\int_0^\infty e^{-h^{-1}(x)}\,dx = K\int_0^\infty e^{-u}h'(u)\,du.$$

Finally, an integration by parts yields $\int_0^\infty e^{-u}h'(u)\,du = \int_0^\infty h(u)e^{-u}\,du$. □

Proof of Proposition 6.2

From the orthonormality property. 



E lis 



m I 



']=E« ( 



i=l 



5^ Var + E 



Hi.; 
By remark inini

lim E[/„(v9i,m)] =T ip{x)s{x)r]{dx), 
lim Yai [In{(pi,m)] =T (pl^{x)s{x)ri{dx). 



Then, (6.5) is true from (2.14) and (2.15). The second statement in the proof
is straightforward since

^[\\si-sr] = E[\\si-sir] + \\si-sr. 

□ 



10 Figures 
Figure 1: Penalized projection estimation of $\frac{1}{x}e^{-x}$.
[Plot omitted. Panel title: Penalized Projection Estimation. Sample path: Gamma process with α = 1 and β = 1 (2000 jumps on [0, 365]). Method of estimation: regular histograms, c = 2. Estimation window = [0.02, 1.0]. Best partition = 51 intervals.]

Figure 2: Penalized projection estimation of $\frac{2}{x}\exp(-2x)$.
[Plot omitted. Panel title: Penalized Projection Estimation. Sample path: Gamma process with α = 2 and β = 0.5 (2000 jumps on [0, 365]). Method of estimation: regular histograms, c = 2. Estimation window = [0.05, 1.05]. Best partition = 37 intervals.]

Figure 3: Penalized projection estimation of $\frac{1}{2x}\exp\left(-\frac{x}{2}\right)$.
[Plot omitted. Panel title: Penalized Projection Estimation. Legend: projection estimator; real Gamma density (α = 0.5, β = 2); estimated Gamma density (α = 0.50353, β = 1.7261). Sample path: Gamma process with α = 0.5 and β = 2, 2000 jumps on [0, 365]. Method of estimation: regular partition with c = 2. Estimation window = [0.05, 1.05]. Best partition = 18.]

Figure 4: Penalized projection estimation of $\frac{3}{x}\exp\left(-\frac{x}{3}\right)$.
[Plot omitted. Panel title: Penalized Projection Estimation. Horizontal axis: x. Legend: projection estimator; real Gamma density (α = 3, β = 3); estimated Gamma density (α = 1.8753, β = 4.4502). Sample path: Gamma process with α = 3 and β = 3, 2000 jumps on [0, 365]. Method of estimation: regular partition with c = 2. Estimation window = [0.05, 5.05]. Best partition = 19.]

Figure 5: Penalized projection estimation of $\frac{3}{x}\exp\left(-\frac{x}{3}\right)$.
[Plot omitted. Panel title: Penalized Projection Estimation. Legend: projection estimator; real Gamma density (α = 3, β = 3); estimated Gamma density (α = 2.8893, β = 2.9). Sample path: Gamma process with α = 3 and β = 3, 4000 jumps on [0, 365]. Method of estimation: regular partition with c = 2. Estimation window = [0.05, 5]. Best partition = 91.]

Figure 6: Penalized projection estimation of $\frac{3}{x}\exp\left(-\frac{x}{3}\right)$.
[Plot omitted. Panel title: Penalized Projection Estimation. Horizontal axis: x. Legend: projection estimator; real Gamma density (α = 3, β = 3); estimated Gamma density (α = 3.2094, β = 2.6878). Sample path: Gamma process with α = 3 and β = 3, 2000 jumps on [0, 365]. Method of estimation: regular partition with c = 2. Estimation window = [1.5, 5]. Best partition = 9.]
Figure 7: Regularized penalized projection estimation of $\frac{1}{x}e^{-x}$.
[Plot omitted. Panel title: Penalized Projection Estimation. Horizontal axis: x. Sample path: Gamma process with α = 1 and β = 1 (2000 jumps on [0, 365]). Method of estimation: regular histograms applied to the Levy density with respect to η(dx) = 1/x^ dx. Estimation window = [0, 1.0]. Best partition = 5 intervals.]

Figure 8: Regularized penalized projection estimation of fexp (— |).
[Plot omitted. Panel title: (Deregularized) Penalized Projection Estimation. Horizontal axis: x.]

Figure 9: Regularized penalized projection estimation of $\frac{1}{2x}\exp\left(-\frac{x}{2}\right)$.
[Plot omitted. Panel title: (Deregularized) Penalized Projection Estimation. Horizontal axis: x. Legend: estimator; real Gamma density (α = 0.5, β = 2); estimated Gamma density (α = 0.48255, β = 2.1131). Sample path: Gamma process with α = 0.5 and β = 2, 2000 jumps on [0, 365]. Method of estimation: regular partitions with c = 2. Estimation window = [0, 1.05]. Best partition = 5.]

Figure 10: Penalized projection estimation of the left-tail of the variance
Gamma Levy density.
[Plot omitted. Panel title: P.P.E. of the left-tail of the Levy density. Legend: projection estimator; estimated Gamma density (α = 525.8513); real Gamma density (α = 500, β = 0.0037); real pdf scaled by 1/A. Sample path: Variance Gamma process, 5000 increments on [0, 2.4606 years]. Method of estimation: regular partition with c = 2. Estimation window = [0.005, 0.02].]

Figure 11: Penalized projection estimation of the right-tail of the variance
Gamma Levy density.

Figure 12: Sampling Distribution for the MLE of the α of the Gamma Levy
Process.
[Plot omitted. Panel title: Sampling Distribution of the MLE for α of the Gamma Process.]

Figure 13: Sampling Distribution for the MLE of the β of the Gamma Levy
Process.
[Plot omitted. Panel title: Sampling Distribution of the MLE for β of the Gamma Process. Horizontal axis: β.]

Figure 14: Sampling Distribution for the Estimates of the α of a Gamma
Levy process obtained from the PPE and the LSE method.
[Plot omitted. Panel title: Sampling Distribution of the PPE-LSE for α of the Gamma Process.]

Figure 15: Sampling Distribution for the Estimates of the β of a Gamma
Levy process obtained from the PPE and the LSE method.
[Plot omitted. Panel title: Sampling Distribution of the PPE-LSE for β of the Gamma Process. Horizontal axis: β.]

Figure 16: Sampling Distribution for the Estimates of α obtained from the
PPE and the LSE method.
[Plot omitted. Panel title: Sampling Distribution of the PPE-LSE for the α of the VG Model. Horizontal axis: α.]

Figure 17: Sampling Distribution for the Estimates of β+ obtained from the
PPE and the LSE method.
[Plot omitted. Panel title: Sampling Distribution of the PPE-LSE for the β+ of the VG Model. Variance Gamma process with α = 500, β+ = .0037, β- = .0037.]

Figure 18: Sampling Distribution for the Estimator of α obtained by the
Method of Moments.
[Plot omitted. Panel title: Method of Moments Estimators for the α of the VG Model.]

Figure 19: Sampling Distribution for the Estimator of β obtained by the
Method of Moments.
[Plot omitted. Panel title: Method of Moments Estimators for the β of the VG Model.]